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IMAGE  UNDERSTANDING  AND  INFORMATION  EXTRACTION 
Research  Summary 

This  report  summarizes  our  research  progress  for  the  period  May  1,  1976 
through  July  31 » 1976  In  Image  Understanding  and  Information  Extraction.  The 
objective  of  this  research  Is  to  achieve  better  understanding  of  image  structure 
and  to  improve  the  capability  of  Image  data  processing  systems  for  extracting 
Information  from  imagery  and  conveying  that  Information  In  a useful  form.  The 
results  of  this  research  are  expected  to  form  the  basis  for  technology  rele- 
vant to  military  applications  of  machine  extraction  of  information  from  air- 
craft and  satellite  imagery. 

Our  research  projects  can  be  categorized  into  six  heavily  overlapping 
areas:  Image  Segmentation,  Image  Attributes,  Image  Structure,  Image  Recogni- 

tion Techniques,  Preprocessing,  and  Applications. 

IMAGE  SEGMENTATION  - In  the  previous  quarterly  report,  we  described  a 
technique  for  accurately  estimating  edge  locations  which  Is  useful  for  appli- 
cations requiring  mensuration.  Burnett  and  Huang  have  continued  to  pursue 
this  work,  which  uses  a discrete  position  finite-state  Markov  process  model 
to  produce  accurate  width  estimates  from  blurred  and  nonlinear  observations 
in  the  presence  of  signal -dependent  noise.  It  is  shown  here  that  the  proposed 
algorithm  is  optimal  when  the  states  are  known  a priori.  Experimental  results 
are  given  for  the  case  where  the  states  are  estimated  from  the  available  data. 

Taking  a different  tack,  Yoo  and  Huang  report  a clustering  approach  to 
image  segmentation.  This  approach,  somewhat  different  from  earlier  approaches 
to  segmentation  by  clustering,  involves  four  relatively  distinct  steps: 

(1)  feature  extraction,  (2)  clustering  of  the  features  in  the  feature  space, 

(3)  transformation  of  the  clustering  results  back  into  the  image,  and  (I*)  seg- 
mentation based  on  cluster  boundaries  in  the  image.  Examples  of  applying  this 
approach  to  various  images  are  provided. 
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IMAGE  ATTRIBUTES  - We  continue  to  pursue  the  analysis  of  shape  and  texture 
In  Images.  Results  of  recent  progress  with  Fourier  shape  descriptors  appear 
In  the  APPLICATIONS  section.  Some  new  results  of  our  texture  research  are 
described  by  Carlton  and  Mitchell. 

IMAGE  STRUCTURE  - Tree  grammars  have  proved  to  be  a useful  approach  for 
characterizing  the  syntax  or  structure  of  images.  In  an  extensive  report 
Fu  and  Keng  describe  the  use  of  tree-grammatical  rules  for  the  description  of 
"objects"  such  as  highways  and  rivers.  By  using  additional  semantic  informa- 
tion, they  have  extended  their  method  to  the  problem  of  recognizing  bridges. 

Further  results  of  using  a syntactic  approach  appear  in  the  APPLICATIONS 
section. 

IMAGE  RECOGNITION  TECHNIQUES  - Pursuing  the  use  of  contextual  information 
for  statistical  classification,  Fu  et  ai.  have  discovered  that  the  form  of  the 
joint  probability  measure  defined  for  the  "random  field"  description  of  the 
image  must  meet  certain  functional  constraints.  This  may  not  prove  to  be  a 
serious  restriction;  however,  further  results  are  not  yet  available.  They  are 
also  developing  simulated  data  sets  which  will  help  to  evaluate  methods  pro- 
posed for  extracting  spatial  (2-dimensional)  infomration  f rom  mul tispectral 
remote  sensing  data. 

PREPROCESSING  - Two-dimensional  complex  cepstrum  analysis  has  been  shown 
previously  to  be  a means  for  stability  analysis  of  two-dimensional  recursive 
filters.  O'Connor  and  Huang  discuss  the  use  of  this  form  of  analysis  for 
enhancement  of  images  blurred  by  certain  point-spread  functions.  They  are 
developing  a software  realization  using  the  Fast  Fourier  Transform  and  have 
applied  it  successfully  to  filter  stability  analysis. 

Berger  and  Huang  have  experimentally  compared  two  methods  for  image 
restoration  in  the  presence  of  noise.  The  Projection  Method  and  the  Singular 
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Value  Decomposition  did  not  yield  very  different  results  In  the  cases  In- 
vestigated, over  a significant  range  of  noise  levels.  From  a practical 
standpoint,  however,  they  found  the  Projection  Method  can  utilize  a priori 
information  about  the  Image,  If  available,  and  Is  more  efficiently  applied 
to  images  of  larger  size. 

APPLICATIONS  - Proceeding  with  their  work,  reported  earlier,  using  Fourier 
shape  descriptors  for  the  analysis  of  airplane  shapes,  Wallace  and  Wintz  have 
developed  a normalization  method  which  Is  not  susceptible  to  noise  problems 
as  have  been  previously  reported  methods.  They  are  now  anticipating  the  in- 
tegration of  their  method  with  automatic  boundary-finding  procedures  in  order 
to  automatically  detect  and  recognize  airplanes. 

Dang  and  Huang  report  some  preliminary  results  from  their  work  in  locat- 
ing airports  In  LANDSAT  imagery.  They  are  using  a combination  of  spatial 
frequency  filtering  and  syntactic  analysis. 

Mitchell  describes  the  directions  being  pursued  in  the  recognition  of 
tactical  targets  In  FLIR  imagery.  This  is  a joint  project  with  Honeywell 
Systems  and  Research  Division. 
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IMAGE  MEASUREMENT 
J.W.  Burnett  and  T.S.  Huang 

I.  Introduction 

Our  last  report  [1]  showed  how  a discrete  position  finite  state  Markov- 
process  model  could  be  used  in  conjunction  with  the  Viterbi  algorithm  (VA)  to 
produce  accurate  width  estimates  from  blurred  and  nonlinear  observations  in  the 
presence  of  signal  dependent  noise.  This  report  shows  the  algorithm  is  optimal 
when  the  states  are  known  a priori,  and  presents  experimental  results  on  the 
algorithm's  performance  when  the  states  are  estimated  from  the  available  data. 
The  Optimality  of  the  VA  for  Discrete  Step  Edge  Location 

A 

Recall  [1]  that  a sequence  £will  be  decided  over  any  other  sequence  £ 

when 

p(Z|£)  P(£)  > P(Z|C)  P(£)  (1) 

As  before  let  all  permissible  sequences  be  equally  likely.  Then  (1)  becomes 

A A 

(for  all  permissible  £ and  £)  decide  5 over  £ if 

p(Z|E)  > p(Z|0  (2) 

It  is  well  known  [2]  that  this  decision  rule  minimizes  the  probability  of 

A 

deciding  sequence  ? when  £ is  the  correct  sequence.  Now  for  step  edges,  each 

A 

£ uniquely  corresponds  to  an  edge  location.  Thus  the  estimate  of  a step  edge 
location  produced  by  the  VA  (with  the  stated  assumptions)  is  the  minimum  prob- 
ability of  error  estimate.  Therefore  if  ir  is  the  probability  that  the  edge 
is  mislocated  a points  by  the  VA  and  is  the  probability  the  edge  is  mis- 
located  a points  by  any  other  technique  hq  £ir  . Let  n be  the  number  of 
points  the  edge  is  mislocated  by  the  VA.  Let  n (n*+  1,  +2,  ...)  be  the  number 
of  points  the  edge  is  mislocated  by  any  other  technique.  Then 
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Equation  (3)  shows  that  on  the  average  the  accuracy  of  the  VA  cannot  be  improv- 

j. 

ed  upon.  Further  if  |n|  = |n  ' | then 


Var  ( f n | ) - Var  (|n*|)  = Z ( [ ot  | - |n|)2TT  - E (|a|  - |n  |)2tt 

a a 
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Za  (tt  - it  ) > 0 
a a — 

a 


(4) 


Thus  the  variance  of  the  edge  location  estimates  produced  by  the  VA  cannot  be 
bettered  by  any  technique  that  is  as  accurate. 

The  Optimality  of  the  VA  for  Discrete  Width  Measurement 
With  the  assumption  of  independent  edges  let 


_ , * x * * 

Pr  (n  =a)  = Z tt„  it  „ 
w g 6 a 6 


(5a) 


and 


P = Pr(n  = a)  = Z it  tt" 
a w g p a-p 


(5b) 


As  noted  in  the  previous  section  tt^  tt^  for  any  y.  Thus  term  by  term  the 
summation  in  (5a)  is  less  than  or  equal  to  the  summation  in  (5b).  Therefore, 


p < p . 
ro  — ra 


(6) 
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By  arguments  similar  to  those  in  the  preceding  section 


n < E n 
1 w'  — 1 w' 


Var  (|n“|)  <_  Var  (|nj) 


providing  E | n^ | = E | | . These  last  two  equations  show  that  the  V A produces 
the  most  accurate  and  minimum  variance  discrete  width  estimates  possible. 


Relation  to  the  Continuous  Case 

A single  measurement  by  the  VA  will  be  an  integer  number  of  sample  points. 
Since  there  is  no  physical  reason  the  size  of  any  given  object  will  be  an  in- 
teger multiple  of  the  sampling  interval  some  constraint  must  be  placed  on  the 
interval  between  points  to  prevent  loss  of  measurement  accuracy  due  to  the 
discreteness  of  the  estimates. 

Suppose  the  width  of  an  object  is  considered  to  be  a continuous  variable 
and  that  a maximum  likelihood  estimate  of  the  continuous  variable  is  available. 

A 

If  the  signal  to  noise  ratio  is  high  enough  the  estimate  w^.  produced  by  this 
technique  will  be  normally  distributed  with  a mean  value  equal  to  the  true 
width  and  a variance  given  by  the  Cramer-Rao  lower  bound  on  the  variance  of  an 

A 

unbiased  estimate  of  w [3].  Since  w^.  is  normally  distributed  with  95.^%  con- 

A A 

fidence  the  true  width  can  be  expected  to  fall  in  the  interval  [w^  - 2cr£R,  w^.  + 
20^]  [3]  where  0^  is  the  variance  predicted  by  the  Cramer-Rao  lower  bound. 

If  for  a particular  problem  w^  is  31.5  sample  points  and  o^.R  is  .1  sample 
points,  it  is  reasonable  to  expect  the  true  width  is  somewhere  between  31.3 
and  31.7  sample  points.  The  closest  width  estimate  the  VA  could  produce  would 
be  31  or  possibly  32  sample  points. 

A A 

Assume  for  the  moment  the  probability  that  the  VA  decided  wp(wD  an  integer) 
sample  points  was  the  probability  the  continuous  measurement  fell  in  the 


Interval  [w^,  w^+l]  sample  points.  The  probability  of  deciding  32  sample  points 
would  be 

Pr(wD<=32)  - Pr(wc  > 32)  = Pr(wc  - 31.5  > .5) 

wf-  31.5 

= Pr (— — i > 5)  = 0 

Similar  arguments  may  be  made  to  show  Pr(wp=  29)  - 0.  Thus  with  high  probabil- 
ity repeated  measurmenets  will  all  have  a value  of  31  sample  points  so  that  un- 
less there  is  "jitter"  in  the  sample  positions  the  bias  will  not  be  reduced  by 
averaging  several  measurements. 

The  problem  can  be  avoided  by  specifying  that  the  interval  AX  between  the 
samples  is  sufficiently  small.  As  a rough  guide 

AX  <_  2ocr  (9) 

seems  reasonable.  This  rule  limits  the  bias  due  to  the  discreteness  of  a 
single  measurement  to  a maximum  of  a^.R  and  at  least  makes  it  possible  for  some 
reduction  In  bias  to  occur  by  averaging  several  measurements. 

Simulated  Width  Measurement 

A pulse  with  a width  of  thirty  sample  points  simulating  a scan  line  across 
an  object  to  be  measured  was  generated  on  a computer  (see  Fig.  1).  The  line 
was  convolved  with  a Gaussian  line  spread  function.  The  blur  had  a standard 
deviation  of  one  sample  point  and  was  normalized  so  that  the  coefficients  sum- 
med to  unity.  Each  blurred  intensity  sample  bR  was  transformed  to  a density 
sample  yR  by 

yR  - log  bk  + .15  (10) 

to  simulate  the  D - log  E curve  of  film.  Independent  normally  distributed 
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Figure  1 Ideal  reflected  light  intensity 


noise  samples  with  zero  mean  and  standard  deviation 


1 

T 


were  added  to  each  value. 


(11) 


Different  SNR's  were  produced  by  varying  the  intensity  levels  of  the  orig- 
inal pulse.  For  this  example  the  SNR  was  defined  as 


SNR 


Imaxlmum  change  In  density  due  to  signal  | 
maximum  noise  standard  deviation 


(12) 


Some  blurred  and  noisy  lines  are  shown  in  Fig.  2. 

The  simulation  was  performed  with  the  intensity  levels  assumed  to  be  un- 
known. Eighteen  training  samples  from  each  of  the  two  density  levels  were 
selected.  For  high  SNR's  the  selection  of  training  samples  that  are  well  away 
from  edge  locations  is  easily  accomplished  by  inspection  of  a singie  scan  line. 

At  low  SNR's  selection  of  training  samples  is  easily  done  by  inspection  of  the 
light  intensity  pattern  for  a SNR  of  2.25  (see  Fig.  3). 

A 

The  training  samples  were  averaged  to  produce  density  estimates  h(a^)  and 

* A A 

h(a2).  These  density  levels  were  converted  to  intensity  levels  aj  and  a2  with 
the  D - log  E relationship 

^-10  "’K1  - •IS) 

(recall  the  blurring  coefficients  sum  to  one).  The  sample  mean  is  not  the 

A 

optimal  estimator  of  the  density  levels  h(a^)  for  this  problem.  The  maximum 
likelihood  estimate  for  example  would  make  use  of  the  knowledge  that  the  variance 
of  the  samples  also  depends  on  the  density  levels.  However,  this  estimate  re- 
quires the  roots  of  a polynomial  be  found.  It  '»"is  decided  the  additional  com- 
puter time  required  to  do  root  finding  would  probably  not  be  worth  the  decreased 
varinace  of  the  estimate  and  so  the  sample  mean  was  chosen. 
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The  results  are  shown  In  Figures  A and  5.  Two  hundred  noise  sample  func- 
tions were  used  at  each  SNR.  Figure  k shows  that  estimating  the  levels  from 
the  available  data  has  very  little  effect  on  accuracy.  Figure  5 shows  the  var- 
iance of  the  estimate  Is  increased  slightly  as  expected  from  [1], 

A Cyl Inder  Problem 

Consider  a polished  metallic  cylinder  on  a black  background  of  "infinite" 
extent.  Illumination  is  from  a light  source  "infinitely"  far  away  so  that  all 
light  rays  striking  the  cylinder  make  a uniform  angle  <t>  with  the  horizontal 
(see  Figure  6).  The  observer  is  located  directly  over  the  cylinder  and  "In- 
finitely" far  away. 

The  light  intensity  in  the  direction  of  a reflected  ray  is  [6] 

reflected  intensity  = CRj(a)  (13) 

where  C is  the  incident  light  intensity  and  Rs(a)  is  the  ratio  of  reflected 
energy  to  incident  energy  as  a function  of  the  incident  angle  a.  By  the  assump- 
tion on  the  observer's  position  and  distance  the  light  intensity  in  the  direction 
of  the  observer  is  CRs(a)  cos  S where  S is  the  angle  between  the  reflected  ray 
and  the  vertical.  The  problem  is  to  measure  the  radius  of  the  cylinder. 

The  problem  of  measuring  the  radius  has  several  interesting  aspects.  First, 
the  ideal  reflected  light  intensity  In  the  direction  of  the  observer  Is  not  con- 
stant but  varies  with  position  on  the  cylinder.  Further,  surface  areas  of  the 
cylinder  that  are  not  directly  illuminated  by  the  light  source  cannot  be  seen 
due  to  the  assumption  of  a black  background.  Finally  since  no  diffuse  re- 
flection occurs  (due  to  the  polished  surface  assumption)  the  areas  of  the  cyl- 
inder that  are  directly  illuminated  but  do  not  reflect  incident  light  toward 
the  observer  also  cannot  be  seen. 

At  any  position  X along  a scan  line  across  the  cylinder  the  normal  vector 


spe 
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to  the  cylinder  surface  will  make  an  angle  0 with  the  horizontal  where  (from 
Figure  6) 
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cos  0 


Wj  - X 
wT 


or 


-1  W1  " X 
0 = COS  (— !— .) 


W„ 


Wj  » center  position  of  the  cylinder 
w 

0 = radius  of  the  cylinder 
The  angle  of  incidence  a is  given  by 
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a = 0 - $ 


and 


(14b) 


s = 0 + a - 2.  - 20  - ♦ - J 

observed  intensity  = CR$(a)  cos  S 

A cylinder  based  on  the  model  of  equation  (14)  was  generated  and  Is  shown 
In  Fig.  7 A scan  line  across  the  cylinder  is  shown  In  Fig.  8.  The  center 
point  Wj  is  at  position  46,  wQ  is  15  sample  points  and 
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The  cylinder  was  convolved  with  a normalized  Gaussian  line  spread  function  with 
a standard  deviation  of  two  sample  points.  The  blurred  intensity  samples  were 
converted  to  density  samples  with  equation  (10)  and  noise  with  standard  devi- 
ation given  by  equation  (11)  was  added.  A blurred  and  noisy  scan  line  is  shown 
In  Fig.  9.  If  the  radius  was  the  only  unknown  quantity  a minimum  cost 
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This  minimization  can  be  performed  by  inspecting  the  available  data  to  establish 


ranges  in  which  and  can  be  expected  to  fall,  choosing  an  initial  value  for 


Wg  from  its  possible  range,  and  then  finding  a value  of  w ^ (from  within  its  pos- 


sible range)  that  minimizes  (18).  This  procedure  is  repeated  with  the  next  pos- 


sible value  for  w^.  The  costs  of  the  two  values  for  wQ  are  compared  and  the 


value  of  Wq  that  has  the  lowest  cost  is  stored  as  w^.  Iteration  proceeds  by 


finding  the  value  of  w^  that  minimizes  the  cost  of  the  next  possible  value  of  wQ. 


The  minimum  cost  of  each  possible  value  for  wQ  is  compared  with  the  cost  of  wQ. 


If  the  cost  of  Wq  is  less  than  the  cost  of  wQ,  wQ  becomes  the  new  wQ. 


Twenty  independent  measurements  of  a cylinder  radius  were  made  using  the 
technique  described  above.  The  results  are  shown  In  Table  1.  The  range  of 
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Table  1 

Radius  Measurements 


A 

A 

True  radius 

wo 

Variance  of  w^ 

15 

14.8 

.16 

possible  values  for  was  taken  as  ten  to  twenty  sample  points  and  the  center 
position  Wj  was  assumed  to  be  between  sample  points  forty  and  fifty.  Approxi- 
mately seventy-five  seconds  of  computer  time  were  required  for  the  twenty 
measurements. 

Measurement  of  a Road 

A 1:5000  scale  black  and  white  negative  taken  with  Kodak  Plus  X Aerographic 
film  was  obtained  and  digitized  on  a flying  spot  scanner.  The  sampling  rate  was 
ninety-six  samples  per  millimeter  and  the  data  was  quantized  to  16  bits  though 
only  the  first  330  levels  were  occupied.  The  scene  is  shown  in  Figure  10  and 
shows  an  intersection  of  two  gravel  roads  in  Warren  County,  Indiana  (the  white 
spot  on  one  of  the  roads  is  due  to  a parity  error  on  a magnetic  tape).  Figure 
11  shows  a close-up  of  one  of  the  roads  and  Figure  12  shows  a scan  line  across 
the  road  of  Figure  11.  Five  hundred  training  samples  from  one  of  the  roads 
showed  the  average  density  was  .942  with  a variance  of  .00213.  One  thousand 
training  samples  from  the  field  surrounding  the  road  had  an  average  density  of 
.669  with  a variance  of  .00236.  The  nominal  film  properties  were  obtained  from 
Tarklngton  [4]  and  Paris  [5].  The  frequency  response  of  the  image  blur  was 
assumed  to  be  the  product  of  the  film  frequency  response 
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Figure  11  Road  Close-up 
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and  the  response  of  an  ideal  diffraction  limited  lens  with  a cutoff  frequency 


of  half  the  sampling  rate: 


T2(f)  "|(cos-,(f  ) - f / l-(f  )2  ) 
c c c 


f * 1*8  cycles/mm 

Figure  13  shows  the  line  spread  function  corresponding  to  equations  (19)  and 

(20). 

Ten  independent  measurements  of  one  of  the  roads  were  made  and  the  results 
shown  In  Table  2.  The  variance  of  each  y^  was  taken  to  be  .00213  if  y^  corres- 
ponded to  a state  where  i^  was  a sample  from  the  road  or  .00236  if  y^  correspond- 
ed to  an  i.  from  the  field, 
k 

The  variance  of  the  digital  measurement  was  1.15  sample  points  squared  and 
the  uncertainti ty  indicated  in  Table  2 represents  plus  or  minus  two  standard 
deviations. 

An  optical  measurement  was  made  with  a magnifier  and  reticle  marked  in 
tenths  of  millimeters.  Table  2 shows  the  results  and  uncertaintity  of  this 
measurement. 

The  site  of  the  road  was  visited  and  the  width  fround  to  be  18'-11"  with 
a tape  measure.  There  is  a fair  amount  of  uncertaintity  connected  with  this 
measurement.  The  edges  of  the  road  are  characterized  by  vegetation  which  can 
overhang  or  encroach  upon  the  road  by  several  inches  on  either  side.  Measure- 
ments on  similar  roads  varied  from  18'  6"  to  19'  10".  Therefore,  the  true 

width  of  the  road  the  day  the  photograph  was  taken  is  not  known  exactly. 

2 

The  Cramer-Rao  bound  Oj.R  was  calculated  assuming  the  density  levels,  var- 
iances and  line  spread  function  used  by  the  VA  were  correct.  This  variance  was 

2 

1.23  sample  points  squared  which  is  reasonably  consistent  with  c VA« 


Table  2 

Road  Width  Measurement 

Results 

Method 

Road  Width 
on  Film 

Width  on 
Ground 

VA 

110.2  s.p.  + 2.1 

5.7**  m + . 1 1 
(18* 10"  + A") 

Optical 

1 . 1 5mm  + .05 

5.75m  + .25 
(1 8' 10"  + 10") 

Tape 

Measure 

5.76m  + .15 
(18' 11"”+  6") 

Finally,  the  effect  of  the  cutoff  frequency  f In  equation  (20)  was  ex- 
amined. Line  spread  functions  for  different  values  of  f^  were  calculated  and 
the  ten  measurements  were  repeated.  The  results  are  shown  in  Table  3. 


Table  3 

The  Effect  of  f on  Wdith  Estimates 
c 


f 

c 

(cycles/mm) 

Width 

(sample  points) 

Variance 

56 

N‘  110.3 

1.20 

48 

110.2 

1.15 

40 

109.8 

1.1 

32 

109.1 

.96 

Table  3 Indicates  the  width  estimates  produced  by  the  VA  are  not  overly  sensi- 
tive to  Imperfect  knowledge  of  the  degrading  system. 
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IMAGE  SEGMENTATION  BY  CLUSTERING 
M.  Y.  Yoo  and  T.  S.  Huang 


I . Introduction 


Two-dimensional  photographic  images  consist  of  several  fundamental  pic- 
torial components.  Each  component  has  a more-or-less  different  property  or 
characteristic  to  human  visual  perception.  We  may  roughly  call  these  compon- 
ents textural  components  or  simply  textures.  It  is  almost  impossible  to  de- 
scribe the  textures  precisely  because  of  the  abundant  variety  of  them  in  the 
real  world,  whereas  It  is  highly  desirable  to  have  even  a rough  measure  to 
distinguish  these  textural  components. 

Zucker  [ 1 ] tried  to  model  the  textures  based  on  the  concept  of  "primi- 
tive" and  regular  or  quasi-regular  patterns,  but  his  approach  is  far  from 
being  practical.  A more  practical  approach  to  texture  is  Haralick's  [2]  ap- 
plication of  textural  features  for  Image  classification  based  on  the  spatial 
dependence  matrix. 

The  set  of  possible  descriptions  of  a picture  is  often  so  large  that  it 
is  impractical  to  describe  the  picture  by  assigning  it  to  an  element  of  this 
set.  Instead,  a more  practical  description  may  often  be  given  by  partitioning 
the  picture  into  objects  and  assigning  each  of  these  objects  to  one  element 
of  a set  of  possible  descriptions  of  objects. 

Segmentation  is  the  partitioning  of  images  into  several  basic  textural 
components,  each  of  which  has  significantly  different*  properties,  (statistical, 
topological).  There  have  been  several  approaches  in  this  direction.  I Fisher 
[3l  tried  to  partition  the  picture  function  Into  a "unimodal  subset"  which 
means  conceptually  a subset  having  only  one  "hill"  In  the  intensity  values  of 
the  points  in  the  subset  and  Gupta  [4]  and  Kettig  [5]  adopted  the  statistical 
hypothesis-testing  of  local  mean  and  variance  to  detect  the  boundaries  in 


closed  forms  and  applied  this  approach  to  data  compression  and  classification. 

The  human  visual  system  is  an  excellent  textural  discrimination  and 
Julesz's  [6]  experiments  show  that  not  only  the  statistical  but  also  the  topo- 
logical properties  of  images  are  important  factors  to  textural  discrimination. 
So  the  best  textural  discriminator  is  the  combination  of  statistical  measures 
and  topological  properties  of  images.  Topological  properties  usually  are  de- 
scribed by  either  sytactic  methods  or  some  algebraic  measures. 

We  are  not  at  this  point  in  a position  to  combine  the  statistical  de- 
scription with  the  syntactic  description  in  an  appropriate  way;  for  the  present 
we  are  mainly  concerned  with  the  pure  statistical  or  the  pure  algebraic 
approach.  The  approach  which  we  propose  consists  of  extracting  pair  features 
using  a 3x3  moving  window,  "eyeball  clustering"  of  features  and  back-trans- 
formation of  the  feature  plane  onto  the  original  picture  domain.  This  approach 
is  motivated  by  the  different  types  of  pair  feature  distributions  for  16 
different  textures  shown  in  Brodatz  [7].  (See  Fig.  1 for  four  examples;  the 
horizontal  axis  is  sample  standard  derivation  and  the  vertical  axis  denotes 
sample  mean.  The  size  of  the  feature  plane  is  6i*x6k.) 

1 1 . The  Image  Segmentation  Algorithm 

The  image  segmentation  algorithm  which  we  propose  consists  of  three 
major  steps:  (1)  feature  pair  extraction,  (2)  clustering  of  features,  and 

(3)  segmentation.  The  feature  extraction  is  the  most  important  step  and  is 
the  extraction  of  a certain  measure  which  represents  the  local  characteristics 
of  the  image  is  a reasonably  simple  form. 

The  clustering  of  features  is  quite  dependent  upon  the  features  chosen 
in  the  first  step.  If  Ideal  features  were  chosen,  the  features  are  well 
clustered  in  the  feature  plane  and  clustering  is  trivial,  otherwise  some 
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heuristic  clustering  algorithm  or  "eyeball  clustering"  should  be  adopted. 

The  segmentation  Is  the  back-transformation  of  the  clustered  features  onto  the 
original  picture  domain  in  the  segmented  image  form.  The  way  of  presentation 
of  segmented  pictures  may  be  either  displaying  the  boundaries  between  different 
textural  components  of  the  image  or  display  of  different  major  textures  in 
separate  picture  domain. 

In  many  applications  (data  compression,  shape  description,  classification, 
etc.)  the  location  of  boundaries  between  different  textures  is  Important  in- 
formation so  we  have  chosen  the  display  of  boundaries. 

The  detailed  description  of  the  three  steps  will  follow. 

A.  The  Feature  Extraction 

The  extraction  of  features  is  a very  important  part  in  image  segmenta- 
tion. The  best  features  may  be  the  detailed  description  of  the  structural 
relationship  between  the  selected  picture  array  and  its  surroundings,  but  this 
is  very  complex  to  be  Implemented  for  computer  processing.  A reasonable  mea- 
sure which  is  significantly  simplified  but  still  contains  major  information 
about  the  selected  picture  array  is  the  statistical  or  the  structural  char- 
acteristics within  a window  of  appropriate  size  centered  within  the  array.  We 
may  lose  some  Information  by  this  simplification  but  in  some  sense  this  ap- 
proach is  more  reasonable  than  detailed  information  for  the  textural  discrimi- 
nation. Natural  images  usually  consist  of  many  or  several  textures  which  are 
smoothly  varying  in  shape  within  the  same  pattern  rather  than  strictly  webbed 
and  every  picture  array  may  be  identified  as  a different  texture  by  the  de- 
tailed description. 

Multidimensional  features  require  significantly  increased  memory  size  in 
the  computation  and  it  is  also  Impossible  to  see  the  clustering  in  the  hyper- 
feature space  so  we  have  worked  with  feature  pairs.  The  three  feature  pairs 


which  we  have  tried  are  the  following: 

1.  The  local  sample  mean  and  the  local  sample  standard  deviation. 

(x-m,y-n)  (x-m,y+n) 


n 

m (x,y) 

(x+m,y-n)  (x+m, y+n) 

Figure  2 Local  window  used  for  local  operations 

Let  P(x,y)  be  the  picture  array.  Then  the  local  sample  mean  M(x,y)  and  the 
local  sample  standard  deviation  S(x,y)  at  the  array  point  (x,y)  based  on  a 
local  window  size  mxn  are: 

. x+m  y+n 

M(x,y)  " Tfm+D  (2n+l ) 1 1 

U“x-m  CT“y-n 

. , i x+m  y+n  1/2 

S(x>y)  ’ J2, g| ) (2n.l ) E 1 - «(x,y)}2 

g=x-m  a=y-n 

2.  Local  minimum  and  local  maximum: 

Let  D = {(a, 8):  a * x-m,  ...,  x+m,  8 ■ y~n,  ...,  y+n}  then  the  local 
minimum  and  the  local  maximum  are 

MIN  (x,y)  = minimum  P(a,8) 

(a, 6)  e D 

MAX  (x,y)  = maximum  P(ot,8) 

(a, 8)  e D 

3.  The  number  of  the  local  "jumps"  and  the  average  magnitude  of  the  jump: 

We  compare  two  adjacent  points  (in  all  directions)  in  the  local  window 


and  tf  the  grey  level  change  Is  greater  than  a preassigned  threshold  value  we 
assume  there  Is  a jump.  When  we  count  the  number  of  "jumps"  we  consider  the 


"jump  ups"  and  "jump  downs"  and  two  successive  "jump  ups"  or  "jump  downs"  are 
counted  as  one  jump  with  a large  amplitude.  We  take  the  total  number  of  jumps 
In  the  local  area  as  feature  1 and  the  average  magnitude  of  a jump  as  feature 
2. 


feature  1 = total  number  of  jumps  in  the  local 
image  of  size  mx  n 

feature  2 = the  average  magnitude  of  a jump  in 
the  area 

B.  The  Clustering  of  Features 

There  are  many  different  approaches  [8-12]  reported  for  clustering  data. 

A simple  and  practical  approach  is  ISODATA  (Iterative  Self-Organizing  Data 
Analysis  Technique  [9]).  In  this  algorithm  several  important  initial  cluster- 
ing centers  are  picked  up  and  assigned  levels.  Each  sample  point  is  merged 
into  the  nearest  center  based  on  the  Euclidean  distance.  Based  on  the  initial 
grouping,  new  clustering  centers  are  calculated  and  if  the  new  clustering 
centers  are  the  same  as  the  old  ones  the  clustering  process  is  terminated; 
otherwise  the  same  kind  of  merging  process  is  repeated. 

The  Euclidean  distance  may  be  modified  if  the  extracted  features  have 
significantly  different  magnitudes  to  avoid  the  "masking  effect"  of  the  dom- 
inant feature,  as  follows: 

d = X/1(M2-M,)2  + w2(a2  - a,  )2 

Where  M.,  a.  are  the  mean  and  standard  deviation  at  each  sample  point.  The 
Wj's  are  weights.  ISODATA  is  easy  to  implement  but  does  not  give  smooth 
boundaries  in  the  feature  plane. 
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The  graph  theoretic  approach  is  a more  elaborate  clustering  algorithm 
[12].  For  large  sample  size  this  algorithm  becomes  basically  the  same  as  the 
valley  seeking  approach.  The  algorithm  gives  smooth  boundaries  but  requires 
an  extremely  large  memory  size  for  two  dimensional  data  sets  and  is  not  prac- 
tical for  large  samples  (61*  x 64  is  the  largest  data  size  for  practical  appli- 
cation but  further  reduction  of  data  is  necessary  for  efficient  use).  "Eyeball 
clustering"  is  the  most  accurate  and  is  very  flexible  because  we  have  complete 
control  of  the  data.  We  look  at  the  data  and  cluster  them  in  arbitrary  groups 
based  on  our  previous  experience.  "Eyeball  clustering"  is  used  for  this  exper- 
iment because  automatic  clustering  is  practically  impossible  by  the  ISODATA  or 
graph  theoretic  clustering  technique  for  very  large  images. 

C.  The  Segmentation 

This  is  the  back-transformation  of  clusters  in  the  feature  plane  onto  the 
textural  components  in  the  original  picture  domain.  Different  clusters  in  the 
feature  plane  correspond  to  different  textures  in  the  original  image  domain 
and  the  number  of  textural  components  depends  upon  how  many  clusters  we  allow 
in  the  feature  plane.  An  alternate  way  of  segmenting  images  is  locating 
boundaries  between  different  textures. 

Resulting  boundaries  form  closed  contours  except  minor  isolated  or  clus- 
tered noisy  points  when  we  set  boundaries  between  different  textural  components. 
Once  the  clustering  of  features  is  complete,  the  textural  discrimination  in 
the  picture  domain  is  determined  and  how  we  transform  the  clusters  back  onto 
the  picture  plane  does  not  affect  the  locations  of  the  textural  components. 
Therefore,  there  is  no  preferred  direction  in  texture  discrimination  processing 
(compare  with  the  BLOBS  [13]).  More  clusters  in  the  feature  plane  give  finer 
boundaries  in  the  picture  plane  and  this  algorithm  is  a type  of  parallel  top- 
to-bottom  approach. 
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lit.  Experimental  Results 

We  calculate  feature  pairs  at  each  picture  array  point  using  a 3x3 
moving  window  centered  at  the  array  and  normalize  and  quantize  features  to 
appropriate  levels.  The  number  of  the  local  jumps  is  a fairly  small  Integer 
and  does  not  have  to  be  quantized,  but  other  features  are  quantized  to  128 
levels  such  that  the  maximum  levels  of  a feature  Is  128. 

Computer  plots  of  the  feature  planes  are  shown  in  Fig.  3 to  Fig.  10.  Only 
major  clusters  show  up  in  the  computer  line  print  outs  and  our  "eyeball  clus- 
tering" is  based  on  those  of  the  line  print  outs,  and  these  gould  electrosta- 
tic prints  are  used  in  conjunction  with  line  printer  outputs  of  the  same  In- 
formation to  locate  the  initial  cluster.  Typical  line  printer  output  cor- 
responding to  Figs.  3 and  4 are  shown  in  Figs.  11  and  12. 

The  density  In  the  feature  plane  denotes  the  total  number  of  picture 
points  in  the  original  image  plane  which  have  certain  feature  pairs  corre- 
sponding to  the  coordinates  of  the  feature  plane.  If  there  is  a dense  cluster 
(Fig.  3)  in  the  feature  plane  which  may  correspond  to  a large  textural  com- 
ponent of  the  original  picture,  some  other  small  clusters  corresponding  to 
small  textural  components  do  not  show  up  in  the  feature  plane  and  renormali- 
zation of  data  excluding  the  data  contributing  to  the  dense  cluster  Is  nec- 
essary to  see  minor  clusters  (Fig.  4).  Actually  in  many  cases,  the  feature 
plane  has  several  clusters  which  give  major  boundaries  in  the  pictorial  plane. 

The  f omentation  based  on  the  local  jump  is  not  quite  adequate  for  the 
pictures  "Girl"  and  "Professor"  and  we  did  not  include  them  in  this  report. 
Sample  clustering  regions  are  indicated  in  Figs.  13  and  14.  The  regions  shown 
in  Fig.  13  are  based  on  the  clusters  in  Fig.  11.  The  regions  In  Fig.  14  are 
based  on  the  combination  of  the  original  dense  cluster  (Fig.  11)  and  the  sub- 
clusters (Fig.  12).  Each  clustering  level  in  each  feature  plane  was 


transformed  back  Into  the  original  picture  domain  and  the  corresponding  picture 
array  is  given  the  same  level.  So  different  textures  in  the  picture  domain 
have  different  levels  corresponding  to  the  levels  of  the  feature  plane.  We 
arbitrarily  set  a boundary  on  the  picture  point  with  the  higher  level  when  two 
different  levels  are  adjacent.  Boundaries  of  three  different  pictures  (missile, 
girl,  professor)  based  on  three  different  feature  pairs  are  shown  in  part  (A) 
of  Figs.  16  to  19  and  Figs.  21  and  22,  and  Figs.  24  and  25.  Part  (B)  of  Figs. 

16  to  19  show  the  over  display  of  the  boundaries  on  the  original  picture  and 
gives  some  idea  about  the  accuracy  of  the  algorithm. 

The  boundaries  shown  in  Fig.  17(A)  are  the  combination  of  the  major  boun- 
daries (see  Fig.  16(A))  and  the  boundaries  detected  based  on  renormalized  sub- 
features (see  Fig.  k or  12).  The  single  isolated  noisy  points  are  dropped  in 
all  cases  except  in  Figs.  16,  18,  and  19.  For  Fig.  19  we  used  a threshold 
value  of  30  when  we  calculated  the  feature  pairs. 
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Feature  plane  for  sample  textures 
Abcissa  is  standard  deviation. 
Ordinate  is  mean. 


Figure  k Sub  Feature  Plane  (missile) 

Horizontal  Axis:  Standard  Deviation 

Vertical  Axis:  Mean 

Size:  128  x 128 


Figure  5 Feature  Plane  (missile) 

Horizontal  Axis:  Local  Maximum 

Vertical  Axis:  Local  Minimum 

Size:  128  x 128 


Figure  11  Feature  Plane  (missile) 

Horizontal  Axis:  Standard  Deviation 

Vertical  Axis:  Mean 
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Sub  Feature  Plane  (missile) 
Horizontal  Axis:  Standard  Deviation 

Vertical  Axis:  Mean 
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Figure  13  Clusters  In  Feature  Plane 

Horizontal  Axis:  Standard  Deviation 

Vertical  Axis:  Mean 

Size:  128  x 128 
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Clusters  In  Feature  Plane 
Horizontal  Axis:  Standard  Deviation 

Vertical  Axis:  Mean 

Size:  128  x 128 
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Figure  1* 
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(A)  Boundary 


(B)  Segmentation 

Figure  17  The  segmented  picture. 
Mean  and  standard 
deviation  used. 
Subfeature  used. 


(B)  Segmentation 


Figure  19  The  segmented  picture. 

Number  of  local  jump 
and  average  amp  of  the 
jump  used. 

Threshold  = 10 


Figure  21  Boundary.  Mean  and 


standard  deviation 
used. 


Figure  22  Boundary.  Local  MIN 
and  local  MAX  used. 
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TEXTURE  EDGE  DETECTION  USING  MAX-MIN  DESCRIPTORS 
S.  G.  Carlton  and  0.  R.  Mitchell 


I.  Introduction 

This  report  is  the  sequel  to  the  Nov.  75  - Jan.  76  ARPA  Interim  Report 
Article  "Texture  Edge  and  Classification  Using  Max-Min  Descriptors".  Since 
that  time  research  concentration  in  this  area  has  been  on  improvement  of  the 
window  averaging  technique  used  previously. 

Application  results  of  the  min-max  descriptors  and  the  improved  window 
averaging  are  also  discussed. 

I I . Window  Averaging  (Previous) 

Previously,  the  window  averaging  used  in  conjunction  with  the  min-max 
descriptors  Involved  a variable  window  size,  dependent  on  picture  context,  and 
required  two  separate  processing  runs  which  provided  averaging  results  in  both 
the  horizontal  and  vertical  directions.  In  the  horizontal  case,  the  averaging 
technique  compared  the  total  extrema  within  a window  to  the  right  of  each 
pixel  point  with  the  total  in  a like  window  to  the  left  of  each  points.  Each 
point  is  processed  individually  in  this  way  and  replaced  by  the  average  cal- 
culated as 


. „ Rtot  ~ Ltot 

H „ + L„  . + K 
tot  tot 


This  average  calculation  gives  maximum  values  at  pixels  located  on  the  boun- 
dary between  texture  regions.  This  same  average  was  then  computed  using  like 
windows  located  above  and  below  each  pixel.  The  two  averages  were  then  com- 


bined and  resulting  maximum  were  marked  as  texture  edges. 


: 

! 
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III.  Window  Averaging  (Now) 

The  window  averaging  technique  currently  In  use  with  the  mln-max  descrip- 
tors still  uses  variable  window  size  based  on  picture  context,  but  now  the 
horizontal  and  vertical  averages  are  computed  simultaneously,  providing  faster 
and  more  accurate  results. 

In  this  technique,  the  window,  NxN,  is  centered  over  a particular  pixel 
and  four  quadrants,  N/2  x N/2,  are  formed  as  shown  below: 


B 

A 

C 

D 

The  resulting  average  calculation  used  to  replace  each  individual  pixel  value 
in  the  resulting  averaged  picture  is 

(B.C)  - (A+D)  * (B+A)  - (C+D)  * scaie 

2N2 

The  use  of  the  absolute  divisor  based  solely  on  the  window  size  is  because 
the  relative  divisor  in  the  previous  method  suffers  from  a disturbing  quality: 
an  off-center  edge  produced  a bigger  output  then  an  on-center  edge.  The 
absolute  divisor  does  not  suffer  this  problem. 

Although  this  averaging  technique  works  very  well,  further  research  into 
other  possibilities  is  continuing. 

IV.  Appl i cat  ions 

As  a test  for  the  max-min  texture  technique,  an  image  scanned  from  the 
North  East  Test  Area  was  used  as  input.  Figure  1 shows  the  512x512  black 
and  white  Image.  Figure  2 shows  the  detected  extrema  with  intensity  used  to 
Indicate  the  size  extrema  detected.  Note  that  the  forested  area  has  a large 
number  of  extrema  of  all  sizes.  A relatively  simple  forest  detector  can  be 


created  by  thresholding  the  total  number  of  extrema  within  a 60 x 60  window 


surrounding  each  point.  The  output  of  this  is  shown  as  full  white  in  Fig.  3 


The  texture  edges  are  detected  using  the  window  averaging  described  above 
Outputs  from  this  program  using  window  sizes  of  60x60  and  18x18  are  shown 
in  Figs.  4 and  5»  respectively.  The  local  maximums  in  Fig.  4 are  shown  super- 
imposed on  the  original  in  Fig.  6.  These  represent  the  edges  between  large 
(at  least  30x  30)  texture  regions.  Using  the  various  size  windows,  we  are 


developing  a hierarchical  structure  of  texture  edge  detection.  We  are  also 
developing  an  edge  detector  based  on  combined  texture  and  Intensity  informa 
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A SYNTAX-DIRECTED  METHOD  FOR  LAND-USE 
CLASSIFICATION  OF  LANDSAT  IMAGES 


K.  S.  Fu  and  J.  Keng 
I.  INTRODUCTION 

This  research  is  motivated  by  the  need  for  a method  which  can  fully 
automate  land-use  classification,  such  as,  highway,  river  and  bridge 
recognition,  from  LANDSAT  images.  The  statistical  pattern  recognition 
techniques  which  have,  been  developed  up  to  now  have  not  shown  satisfactory 
results.  For  instance,  the  land-use  classification  of  LANDSAT  images  has 
been  studied  by  Todd  and  Baumgardner  using  spectral  analysis  [i].  It  has 
been  shown  that  highways  and  other  concrete  areas,  such  as  parking  lots, 
could  not  be  distinguished  from  each  other  due  to  the  fact  that  both  have 
similar  spectral  characteristics  in  the  spectral  analysis.  This  report 
introduces  a method  of  spatial  analysis  for  the  same  purpose  of  land-use 
classification  without  encountering  the  difficulties  mentioned  above. 

Spatial  analysis  in  picture  recognition  problems  can  be  treated  by 
syntactic  approach  [2],  Recently,  utilization  of  syntactic  method  to 
describe  spatial  relationships  among  different  objects  was  suggested  by 
fu  [3].  Some  related  research  has  been  done  on  LANDSAT  images.  Braycr 
and  Fu  [l|]  recognize  a city  scene  by  constructing  a hierarchical  graph 
model  which  contains  spatial  distributions  of  all  classes  in  the  scene. 

Web  grammars  are  used  to  describe  spatial  relationships  between  various 
objects  in  the  scene.  Li  and  Fu  [5]  started  with  pointwise  statistical 
classification  of  LANDSAT  images;  then  applied  tree  system  approach  to 
LANDSAT  data  interpretation.  Bajcsy  and  Tavakol i [6]  designed  a computer 
program  from  the  relational  graph  viewpoint  to  recognize  objects  from 


The  research  undertaken,  here,  applies  to  land-use  class! ficat ion  of 
LANDSAT  images  such  ns  highway,  river  and  bridge  recognition.  The  suggest 
ed  approach  is  a Syntax-Directed  Approach  [7].  The  method  based  on  this 

r 

approach  utilizes  a set  of  tree  grammar  rules  to  describe  the  objects  of 
Interest,  such  as  highways  and  rivers.  Accompanied  by  utilizing  semantic 
Information,  the  application  of  this  method  is  extended  to  the  problem  of 
bridge  recognition. 

The  LANDSAT  system  (formerly  the  Earth  Resources  Technology  Satellite 
"ERTS")  consists  of  three  major  components;  two  spacecraft,  the  remote 
sensors,  and  the  ground  data  handling  system  [8],  The  overall  system  v/as 
designed  to  perform  three  functions;  the  acquisition  of  mul ti spectral 
Images,  the  collection  of  data  from  remotely  located  sensors,  and  the 
production  of  photographic  and  digital  data.  There  are  four  channels; 
channel  1 (wavelength  0.5-0.6  micrometer),  channel  2 (wavelength  0.6-0. 7 
micrometer),  channel  3 (wavelength  0.7”0. 8 micrometer)  and  channel  k 
(wavelength  0. 8-1.1  micrometer).  The  first  two  channels  are  visible 
bands  and  the  latter  two  are  infrared  bands.  LANDSAT  images  are  given  in 
a digitized  form  by  NASA  with  spatial  resolution  of  one  pixel  correspond- 
Ing  to  79x56  (meters)  on  the  earth. 
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II.  SYNTAX-DIRECTED  METHOD 

The  proposed  syntnx-d i reeled  method  involves  t he  fol lowing  steps: 

An  inference  process  is  applied  to  a set  of  training  imagery  data  to  infer 
a set  of  grammatic  rules  which  in  turn  formalizes  a syntactic  model.  Based 
upon  this  model,  a set  of  most  probable  window  patterns  (which  are  gen- 
erated by  this  grammar)  is  implemented  to  analyze  the  test  images  and  to 
recognize  the  objects  of  interest. 

2.1  Inference  Process 

The  grammatical  inference  process  is  a man-computer  interactive  system. 
Based  on  the  knowledge  of  highway  structures,  several  initial  tree  grammar 
rules  are  written.  Then  a training  area  is  selected  (which  in  this  case 
is  Lafayette,  Indiana).  The  training  image  is  processed  by  the  initial 
set  of  grammar  rules.  An  existing  highway  map  is  also  provided  for  the 
purpose  of  comparison  with  the  processed  result,  for  the  highway  struc- 
tures which  exist  in  the  map  but  not  in  the  processed  result,  the  grammar 
rules  to  generate  those  structures  are  added  to  the  initia'l  set  of  grammar 
rules,  and  the  image  is  processed  to  test  this  hypothesis.  After  several 
Interactive  steps  the  final  set  of  grammar  rules  is  obtained.  The  prim- 
itives for  the  grammar  are  chosen  as  a,  b,  c,  d,  e,  f,  g,  and  h.  These 
primitives  are  designed  by  a 2x2  pixel  block. 
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P al  e i 

SB 

h*-*- 

For  example,  the  window  area  of  an  intersection  of  highways  is  such 


The  inferred  grammar  is 


S -v  $ 


A -*  d 


A d 

/\ 

B C 


B -*•  c 


C -*■  g 


This  inference  process  continues  and  at  most  has  possible  patterns. 
The  resultant  grammar  rules  are  of  course  too  many.  We  design  a simpli- 
fied tree  grammar  analyzer  using  a window  operation  analyzer.  The  move- 
ment of  the  window  is  to  shift  one  column  or  one  row  at  a time.  Then 
multibranch  patterns  can  also  be  represented  by  one-branch  grammar  rules. 
For  example,  the  window  pattern  mentioned  above  can  be  analyzed  as  the. 


following  two  window  patterns. 
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The  one-branch  grammar  rules  are  (A,,  A,,  A,,  AL  corresponds  to  the 
grammar  rules  presented  later). 

(1)  S -*•  $ A,  -*■  d 

i 3 i 

a3  A 

(2)  S -*•  $ A.  -*•  c 

i ' i 

Aj  A 

For  a junction  of  two  highways  a: 


it  can  be  analyzed  by  two  window  patterns  that  are  generated  by  one- 
branch  grammar  rules.  The  two  patterns  are 


* *■  j •» 


a3  d 


\ - c 


I “ I 


\ - c 


A.  c A -*-  c A.  -*■  c 
I | 2 | 2 


> follows 


Then  the  resultant  grammar  rules  can  be  expressed  in  terms  of  only  one- 
branch  tree  grammar  rules  as  follows*. 

The  tree  grammar  Gt  is  Gt  - (V,  r,  P,  S)  where 


V ■ {S,  $,  Aj , A^,  Aj,  A^,  Aj,  Ag,  Ay,  Ag,  A^,  A^,  a,  b,  c, 
d,  e,  f,  g,  h} 

r (a)  - r (b)  * r(c)  - r(d)  « r(e)  r(f)  = r(g)  = r(h)  = {1} 

*Strlctly  speaking,  the  simplification  essentially  reduces  the  tree  grammar 
to  a string  grammar.  However,  the  spirit  of  syntax-directed  method  is 
still  preserved. 


Jiti  i Xf  h 
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A training  area  is  set  to  establish  the  thresholds  of  the  spectral  intensity 
of  concrete  areas  (more  precisely,  concrete  - like  areas)  in  channels  1 and  2. 
Then  a threshold  H is  obtained  from  the  sum  of  two  thresholds  from  these  chan- 
nels. Watery  areas  exhibit  a very  low  response  in  the  infrared  bands.  This 
contrast  makes  the  extraction  of  watery  areas  from  channels  3 and  A easier. 

The  same  procedure  of  threshold  finding  as  that  for  concrete  areas  is  applied 
here  to  obtain  threshold  R for  river  recognition. 


2.2  Syntax-Directed  Method 

The  syntax-directed  method  consists  of  two  levels,  namely,  transformation 
process  and  tree  grammar  analysis.  The  transformation  processor  transforms 
the  mul tispectrai  images  into  a single  binary  image.  The  tree  grammar  ana- 


lyzer then  analyzes  the  transformed  image  based  on  a tree  grammar.  Structures 


which  are  generated  by  the  tree  grammar  are  accepted;  otherwise,  they  are 
rejected.  The  detailed  processes  are  illustrated  as  follows: 
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2.2.1  Transformation  Process 

2. 2. 1.1  Thresholding  Process.  First  the  model  of  LANDSAT  images 
is  defined  in  the  Euclidean  n-d?mensional  space  En.  The  number  n 
represents  the  number  of  channels  to  be  chosen.  A pixel  is  described  by 

an  ordered  n-tuple  (Xj ,X^ Xn).  The  transformation  process  works 

such  that  if  the  sum  of  spectral  intensities  of  the  points  in  the  same 
position  in  two  channels  is  greater  than  the  sum  of  the  two  thresholds 
from  training  area  (for  instance,  channel  l,  2,  and  threshold  H for  high- 
way recognition,  and  channel  3,  h,  and  threshold  R for  river  recognition), 
the  position  is  set  to  1;  otherwise,  it  is  set  to  0.  Thus,  mul t i spectral 
images  are  transformed  to  a single  binary  image.  (For  river  recognition, 
the  one-zero  settings  are  inverse.) 

It  is  true  that  both  visible  bands  (channel  1 and  2)are  sensitive 
to  the  concrete  spectra.  But  in  real  world  images,  the  influence  of 
neighboring  objects  sometime  cause  the  deformation  of  the  object  of  in- 
terest (such  as  highway).  But  when  there  is  only  one  channel  (image) 
available,  the  thresholding  process  can  be  designed  by  setting  the  thres- 
hold on  one  image.  Experiments  of  this  case  were  also  conducted  and  it 
showed  that  by  using  the  sum  of  the  spectral  intensities  of  two  visible 
channels  (for  highway)  one  obtains  a more  reliable  result  than  that  by 
just  setting  a threshold  on  one  channel. 

2. 2. 1.2  Line  Smoothing  Process.  After  the  thresholding  process,  a 
line  smoothing  technique  is  applied  to  remove  deformation  and  reestablish 
continuity  of  the  lines.  For  a given  center  pixel  of  a 3x3  window. 


•*,w  [i>p. 
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the  operation  starts  from  the  left  upper  corner  pixel.  If  it  is  one,  the 
column  is  shifted.  If  it  is  zero,  the  surrounding  eight  pixels  are 
checked.  If  there  exists  at  least  two  "l's"  which  are  not  adjacent  to 
each  other,  then  a "1"  is  set  on  the  center  position.  The  operation 
continues  until  reaching  the  rightmost  column  of  the  digitized  image. 

Then  the  operation  is  shifted  one  row  down  and  starts  from  the  left  most 
column  with  the  same  process  until  reaching  the  last  row  of  the  digitized 
image. 

2.2.2  Tree  Grammar  Analysis 

Input  ; The  transformed  binary  image  which  is  a Q.( I , J)  memory  array. 

Output:  The  syntax-directed  analysis  result  on  land  use  classifica- 

tion. 

Algori thm : 

Step  1:  Set  G(M,N)  to  be  an  operation  window  (8x8  in  size). 

Step  2:  Load  the  array  of  Q(l,J)  where  J = 1,  8;  I — 1,8 

on  the  operation  window  G(M,N). 

Step  3;  Compare  the  operation  window  with  a set  of  most 
probable  window  patterns  (see  Fig.  1)  which  are 
generated  by  the  tree  grammar  Gt»  If  it  belongs 
to  that  set  of  patterns,  the  primitive  pattern  in 
that  window  is  accepted,  and  stored  in  the  resulting 
memory  array  R(I,J).  If  it  does  not  belong  to  that 
set  of  patterns,  then  go  to  step  A. 

Step  4 i Shift  one  column  to  the  right  of  Q(l,J)  in  step  3. 

Then  go  to  step  3 and  continue  until  reaching  the 
right  most  column. 


2.2.1  Transformation  Process 


2. 2. 1.1  Thresholding  Process.  First  the  model  of  LANDSAT  images 
is  defined  in  the  Euclidean  n-dimensional  space  En.  The  number  n 
represents  the  number  of  channels  to  be  chosen.  A pixel  is  described  by 
an  ordered  n-tuple  (X^ , . . . ,Xn) . The  transformation  process  works 
such  that  if  the  sum  of  spectral  intensities  of  the  points  in  the  same 
position  in  two  channels  is  greater  than  the  sum  of  the  two  thresholds 
from  training  area  (for  instance,  channel  1,  2,  and  threshold  H for  high- 
way recognition,  and  channel  3,  *♦,  and  threshold  R for  river  recognition), 
the  position  is  set  to  1;  otherwise,  it  is  set  to  0.  Thus,  mul t ispectral 
images  art  transformed  to  a single  binary  image.  (For  river  recognition, 
the  one-zero  settings  are  inverse.) 

It  is  true  that  both  visible  bands  (channel  1 and  2)are  sensitive 
to  the  concrete  spectra.  But  in  real  world  images,  the  influence  of 
neighboring  objects  sometime  cause  the  deformation  of  the  object  of  in- 
terest (such  as  highway).  But  when  there  is  only  one  channel  (image) 
available,  the  thresholding  process  can  be  designed  by  setting  the  thres- 
hold on  one  image.  Experiments  of  this  case  were  also  conducted  and  it 
showed  that  by  using  the  sum  of  the  spectral  intensities  of  two  visible 
channels  (for  highway)  one  obtains  a more  reliable  result  than  that  by 
just  setting  a threshold  on  one  channel. 

2. 2. 1.2  Line  Smoothing  Process.  After  the  thresholding  process,  a 

> ■•*#  smoothing  technique  is  applied  to  remove  deformation  and  reestablish 
■ mm'  ■ • * n*  the  lines.  For  a given  center  pixel  of  a 3x3  window. 
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Step  5 • Shift  one  row  downward  In  step  3;  go  to  step  3,  until 
reaching  the  last  row  of  the  digitized  image.  Syntac- 
tically correct  structures  a re 'recognized  and  stored 
In  the  resultant  memory  array  R(I,J). 

Step  6 : Output  the  result,  R(t,J),  which  is  the  result  of 
the  syntax-directed  method. 

The  flow  chart  for  highway  recognition  by  the  syntax-directed  method 
is  given  in  Figure  2.  Since  the  rivers  are  similar  linear  features  to 
highways,  the  inferred  tree  grammar  for  highway  can  be  used  also  for 
river.  This  assumption  is  justified  by  the  results  of  the  experiments 
on  river  recognition.  The  flow  chart  for  river  recognition  is  also  pro- 
vided in  Figure  3. 


2.3  Use  of  Semantic  Information 

Spectrally  speaking  bridges  have  similar  characteristics  to  concrete 
parking  lots,  urban  housing,  and  highways.  These  aspects  make  statistical 
techniques  .inadequate  for  bridge  recognition.  The  idea  proposed  is  to  ) 
use  the  spatial  relationship  to  distinguish  highways  from  other  concrete 
areas  by  the  syntax-directed  method,  and  then  to  use  semantic  information 
to  distinguish  the  bridges  from  detected  highways. 

Fi  rst  the  images  are  processed  by  syntax-directed  method  for  high- 
ways and  rivers.  Then  a semantic  processor  is  designed  which  sequentially 
processes  semantic  rules  as  follows: 

(j)  Bridge  pixels  are  highway  pixels  overlaying  water  areas 
(river,  lake,  or  gulf). 

(fi)  Bridge  pixels  never  exist  singularly  in  the  continent. 
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Figure  3 The  flow  chart  of  river  recognition  via  the 
syntax-directed  method. 
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(ill)  Both  ends  of  the  bridge  are  always  connected  to  the  high- 
ways. 

A flow  chart  of  the  bridge  recognition  is  in  Figure  4.  The  length 
of  the  recognized  bridge  can  also  be  calculated  by  the  following  algo- 
rithm. 

Algorithm  of  calculation  of  bridge  length: 

Input;  recognized  bridge  result. 

Output;  the  calculated  length  of  bridge. 

Algorithm; 

(1)  Calculate  the  number  of  horizontal  rows  which  have  at 
least  one  bridge  pixel.  The  value  is  a. 

(ii)  Calculate  the  number  of  vertical  columns  which  have  at 
least  one  bridge  pixel.  The  value  is  b. 

(iii)  If  a equals  one,  the  length  of  the  bridge  c is  bx56 
meters. 

If  b equals  one,  the  length  of  the  bridge  c is  a x 79 
meters. 

Otherwise  go  to  (iv) 

(iv)  The  length  of  the  bridge  c = /"  (ax79)2  + (bx5&)2  . 

The  idea  for  this  algorithm  Is  to  calculate  the  longest  side 
of  a right  triangle,  and  every  pixel  in  LANDSAT  images  is 
about  79  meters  in  vertical  length  and  §6  meters  in  horizontal 
length  on  the  earth.  Step  (iii)  are  the  cases  when  bridge  is 
right  on  the  horizontal  row  or  vertical  column.  The  coordinate 
for  the  locations  of  the  recognized  bridges  are  also  located 
by  recording  recognized  bridge  pixels. 
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Figure  **  The  flow  chart  of  bridge  detection  via  the  syntax 
directed  method  with  semantic  process 


III.  EXPERIMENTAL  RESULTS 


The  syntax-directed  method  has  been  implemented  by  FORTRAN  pro- 
gramming on  the  IBM  360/67  computer  at  the  Laboratory  for  Applications 
of  Remotely  Sensing  (LARS).  The  experiments  have  been  conducted  on  dif- 
ferent LANDSAT  images.  Only  one  training  data  set  (from  Lafayette  area) 
was  used  for  all  the  experiments. 

3.1  Highway  Recognition 

Pig.  5(a)  is  a LANDSAT  image  over  the  Indianapolis,  Indiana  area. 

Pig.  5(b)  is  the  intermediate  output  after  line  smoothing  process  in  the 
transformation  process.  The  highway  recognition  result,  by  the  syntax- 
directed  method,  is  shown  in  Fig.  5(c).  The  area  is  a 96x96  image  which 
shows  the  junction  of  interstate  highway  65  (northwest  to  southeast 
direction)  and  highway  k(>5  (north  to  south  direction)  in  the  left  upper 
part  of  Fig.  5(d).  The  experimental  result  shows  that  the  syntax-directed 
method  Is  rather  successful. 

3.2  River  Recognition 

For  the  purpose  of  showing  that  this  method  works  also  for  rivers,  a 
terrian  area  northeast  of  San  Francisco,  California  was  processed  by 

the  syntax-directed  method  for  river  recognition.  The  LANDSAT  image  is 
Fig.  6(a).  The  river  recognition  result  by  the  syntax-directed  method, 
Fig.  6(b),  shows  that  it  successfully  recognizes  a winding  river  in  that 
linage.  The  size  of  the  image  is  also  96x96.  The  topographic  map  for  the 
same  area  is  shown  in  Fig.  6(c). 
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Figure  5(b)  Intermidiate  output  after  line 
smoothing  process  on  Figure  5(a) 


86 


i 


I 


I J 


H 

HH 


MM 

HH 

HHHHHhh 

HH 


HH 

HH 

HH 

HH 

HH 

hhh 

hhhh 

HHHH 

HHHH 

HHHH 

HHHH 

hhhhh 

Hhhhh 

hhhh 

hhhh 

hhhh  hhhhh 
rHHHHHHIiHHM 
HHHHHHHHHH 
HH 
HH 


HHH 

HH 

HH 

HH 

HHHH 
H HHHH 

hhhhhh 

H HHHHH 
H HHHHH 


HH 
HH 
HH 
HH 
HH 
HH 
HH 
MH 
HH 

hm 
hh 
HH 
HH 
HH 
HH 
HH 

HH 
H HH 
HHH 
HHH 
HHH 
HHH 
HHH 
HHH 
HHH 

hhhh 
, hhhhhh 
HHHHHH*  HhHHHHM 
HHHHHHHHHH  HHH 
HHHH 
HHH 
HHH 
HH 
HH 
HH 
HH 
HH 
HH 
HH 
HH 
HH 
HH 
MH 
MH 
hh 
HHH 

hhhhhh 
HHHHHHHHHH 

HHHHHH  MMHHMhhHHHHHHH 

hhhhhh 

HHH  Hh'hH  HHHHHHHHHH 

hhmhhhhhh 

H HHhHMhH 

HM  hhhhhh 
hhhhhhhhm 

Hhh  HHHHH 


H 

, HHHHH 
HHHHHhhhliH 

hjjhmhhhhh 


HHH 
HHHMHHHM 

HHNHHHHHHHHHH 

HH 


HHH 
HHHH 
HH  HH 
HH  HH 
HH  H 


MHNHHHHHHHHHHHhHMH 


HM 

HH 

HHH 

hhh 

Hhh 

HH 

HH 


HH 

HH 

HH 

H 

HH 

hh 

hm 

hm 

hh 

hh 

H 

MH 

hh 

hm 

hhh  h 

mmmh 

Hhh 

hum 

hhh 

HHHHH hhhh 

hhmhhhhhh 

hhhh 

hhh 

hhh 

hhh 

HH 

MH 

HH 


HHHH 

HHH 

HHH 


HHHH 

HHH 

HHHH 

HHHHH 

HHh 


HhH 
HH 
HH  . 

HHH  H 

hhmhh 

HHHHH 

•HHHHH 

HHHHH 

Hflhii 

HHH 

Hh 

Hh 

Hh 

HH 

HH 


HM 

Hh 

hh 

Hhh 

Hhh 

hhhh 


n”nn 

HHH  HHHHH 
hhhhhh  Hil  l 

HHHHHHrtM 

HhHHHHHh 

hhhhhmhh 

HHMHHHHHH 

HHHHHHHHHHHHi 

MHHhhHHHHHHI 

I 

H 

M 

MHMF 


Figure  5(c) 
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Figure  6(a)  Satellite  image  of  north  part  of 

San  Francisco  Bay  area,  California 
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Figure  6(b)  River  recognition  result  of  Figure  6(a) 
by  syntax-directed  method 
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3.3  Bridge  Recognition 

Fig.  5(a)  is  a satellite  image  over  Indianapolis,  Indiana.  Fig.  7(a) 
is  a topographic  map  of  its  lower  part.  It  shows-  in  the  left  part  of  the 
map,  that  there  is  a bridge  over  the  Cagle  Creek  Reservoir.  The  bridge 
recognition  result  by  the  syntax-directed  method  given  in  Fig.  7(b)  shows 
that  the  birdge  is  successfully  recognized  and  the  length  is  calculated 
to  be  672  meters.  In  the  left  lower  part  of  Fig.  7(a)  the  scale  of  the 
map  is  provided.  The  length  shown  in  map  is  about  the  same  as  that  found 
by  our  method.  The  coordinate  of  the  bridge  can  also  be  automatically 
located  by  this  method. 

A third  experiment  . i s on  the  Lafayette  area  of  which  the  LANDSAT  image 
is  shown  in  Fig.  8(a).  Fig.  8(b)  is  a city  map  segment  of  the  Lafayette 
area  which  shows  a small  bridge  on  Highway  1-65  over  the  Wabash  River. 

This  LANDSAT  image  has  been  processed  by  the  syntax-directed  method  for 
bridge  recognition  and  the  result  is  shown  in  Fig.  8(c).  The  bridge  is 
recognized  in  the  right  lower  part  of  the  image  and  its  length  is  calcu- 
lated to  be  ^5^.1  meters.  The  coordinates  of  location  of  the  bridge  is 
also  given  in  Fig.  8(c). 

This  information  extraction  (bridge  length  and  coordinate)  not  only 
contributes  to  the  understanding  of  images  and  may  also  aid  the  automatic 
guidance  missile  system  in  locating  accurately  the  objects  of  interest. 
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Figure  7(a)  Topographic  map  of  lower  part  of  Figure  5(a) 
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»M€  CALCULATED  LENGTH  Of  TMf  DETECTED  BRIDGE  F ROM  SATELLITE  PICTURE 


672.0  METERS 


Figure  7(b)  Bridge  recognition  of  satellite 
by  syntax-directed  method 


image  Figure  5(a) 
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VHl  CALCULATED  LENGTH  OF  THE  DETECTED  BKlOGE  E KUM  SATELLITE  PICTURE  • 


454.1  METERS 


Figure  8(c)  Bridge  recognition  result  of  satellite  image 
Figure  8(a)  by  syntax-directed  method 
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IV.  CONCLUSIONS  AND  REMARKS 


The  syntax-directed  method  is  implemented  by  the  "LOGICAL  program- 
ming technique"  on  binary  images.  So  the  software  manipulates  most  opera- 
tions on  the  machine  logic  level.  Comparative  studies  are  also  carried 
out  with  two  different  implementations.  One  uses  the  logical  programming 
and  the  other  does  not.  The  one  with  logical  programming  saves  30%  of  the 
CPU  time  comparing  with  the  other.  Computer  processing  time  for  the  syntax- 
directed  method  is  rather  fast  compared  with  the  previous  related  work. 

For  a 96x96  images,  the  proposed  method  takes  only  approximately  26  seconds 
to  detect  and  recognize  highways.  It  takes  approximately  a total  of  *i2 
seconds  to  recognize  highways,  rivers  and  bridges. 

Concerning  computer  memory  space,  there  is  another  advantage  of 
LOGICAL  programming  in  that  every  transformed  pixel  takes  only  one  byte 

j 

for  storage.  Usually  each  pixel  takes  8 bytes  (real  number)  or  bytes 
for  storage  (integer).  Use  of  the  logical  programming  saves  memory  space 
approximately  75%  comparing  with  the  one  using  four  bytes  storage  for  each 
pixel . 

The  proposed  syntax-directed  method  for  land-use  classification  has 
the  advantages  of  fast  processing  time  and  rather  accurate  results.  It 
can  be  easily  extended  to  image  segmentation  problems. 
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THE  USE  OF  CONTEXTUAL  INFORMATION  IN  STATISTICAL  CLASSIFICATION 
K.S.  Fu,  P.H.  Swain,  T.S.  Yu,  and  W.  Pfaff 

I.  Introduction 


In  previous  reports  we  have  Introduced  compound  decision  theory  [1]  as  a 
means  to  use  contextual  Information  in  the  classification  of  remote  sensing 
data.  The  compound  decision  rule  is  to  choose  a^,  for  each  individual  cell  k. 


which  minimizes 


* ,l(V  \>  P«JV  G(V 

®k  1 


(1) 


Note  that  X is  the  set  of  pattern  vectors  for  all  cells  in  the  imagery  frame. 
~n 


The  compound  decision  uses  all  the  measurements  from  the  image  frame  to  esti- 


mate the  state  of  nature  6, . The  cells  in  the  data  frame  are  classified 

k 


simultaneously  even  though  the  decision  rule  is  defined  for  each  celi. 

Suppose  we  reduce  the  data  frame  to  a small  window  size,  say,  the  3x3 
square  window  (Fig.  1),  the  decision  rule  follows  from  (l)  and  is  to  choose 


a.  to  minimi ze 
k 


all  e 


I L(0^,  a^)  P (xj  ,x^ , • . .Xg 1 0^)  6(6^) 


(2) 


If  we  classify  every  window  from  a finite  set  of  possible  occurrences  of 
windows,  the  classification  of  the  center  cell,  namely  cell  1 in  Fig.  1,  is 
then  a result  of  the  (spectral)  properties  of  its  neighbors  as  well  as  its 
own.  Thus,  the  "context"  defined  in  the  scene  contributes  to  the  classifica- 
tion. 

1 1 . Preliminary  Solution 

Immediately  we  see  that  the  calculation  of  the  joint  probability  for  the 
cells  In  the  window  is  necessary.  Probability  measure  defined  on  multi- 
dimensional space  is  in  the  theory  of  random  field.  We  do  not  intend  to  go 


8 

3 

7 

k 

1 

2 
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Figure  1 Neighbors  of  cell  1 

into  the  theoretical  development  but  rather  we  give  the  research  result. 

Suppose  we  assume  the  probability  distribution  belongs  to  an  exponential 
family.  And  suppose  that  the  dependence  is  from  cliques"  containing  no  more 
than  two  cells.  The  joint  normal  density  function  for  any  n cells  in  our 
window  is  [2] 

-in  1 

P(xr.xn)  = 2-no2)  2 | B | 2 exp  {-Io'2(X-U)T  B (X-U)  } (3) 


where  U is  nxl  vector  of  arbitrary  means,  p.,  and  B is  the  nxn  matrix  whose 
diagonal  elements  are  unity  and  whose  off-diagonal  elements  are  - B . j . We 
require  B to  be  positive  definite. 

One  important  feature  about  this  formulation  is  that  the  window  is  not 

restricted  to  any  particular  shape.  As  long  as  we  know  the  neighboring  cells 

to  be  used  to  provide  context  relationship,  it  is  possible  to  construct  the 

2 

joint  function.  The  unknown  parameters  in  (3)  are  B matrix  and  a( . In  a sub- 
sequent report  we  shall  show  estimation  of  unknown  parameters  for  a particular 
window  (nearest  neighbor  scheme)  and  present  some  results. 


*CHque:  Any  set  of  cells  which  either  consists  of  a single  cell  or  else  fn 

which  every  cell  is  a neighbor  of  every  other  cell  in  the  set 
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III.  Oata  Simulation 

The  algorithms  we  are  Investigating  ordinarily  require  various  assumptions 

concerning  the  statistical  characteri sties  of  the  data.  For  example,  it  is 

often  assumed  that  the  data  is  class-cond? tional ly  distributed  multivariate 

« 

normal  and,  when  classification  is  involved,  that  the  data  used  to  train  the 
classifier  Is  truly  representative  of  the  classes  of  interest.  It  is  further 
assumed  that  all  classes  present  In  the  scene  are  known  and  represented.  Also, 
It  Is  usually  assumed  that  each  pixel  consists  entirely  of  one  and  only  one 
of  the  classes. 

In  practice  these  assunptions  are  never  exactly  satisfied.  Thus,  when  we 
evaluate  the  results  of  applying  our  algorithms  to  any  real  data  set,  we  are 
faced  with  uncertainty  as  to  whether  the  errors  we  observe  are  due  to  short- 
comings of  the  algorithms  per  se  or  to  some  assumptions  we  made  about  the  data 
in  order  to  apply  the  algorithms.  We  can  not  then  determine  effectively  where 
the  strengths  and  weaknesses  of  our  approach  lie. 

The  purpose  of  this  data  simulation  activity,  therefore,  is  to  make  avail- 
able data  sets  with  controlled  characteristics.  In  particular,  to  begin  with 
we  shall  generate  simulated  LANDSAT  multispectral  data.  The  inputs  to  the 
simulation  will  be  high-quality  classifications  of  selected  ground  scenes  to- 
gether with  the  estimated  mean  vectors  and  covariance  matrices  of  the  classes 
In  the  scene.  In  the  output,  the  multispectral  data  values  will  be  regenerated 
so  that  the  classes  are  in  fact  multivariate  normally  distributed  and  there 
are  no  mixture  pixels.  Since  we  will  have  started  with  an  accurate  classifi- 
cation, the  spatial  characteristics  of  the  data  will  closely  match  those  of 
the  actual  ground  scene,  but  the  multispectral  character  1st les  will  be  con- 
trolled to  meet  the  multivariate  normal  assumption. 

This  data  will  be  used  specifically  to  test  our  methods  for  extracting 


' -T-iw  4.+ HIT-'  W- 
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contextual  '"(formation  (which  Is  primarily  of  a spatial  orientation)  for  use 
In  classification.  In  the  next  report  we  shall  detail  the  methods  used  to 
accomplish  the  simulation. 
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TWO-DIMENSIONAL  COMPLEX  CEPSTRUM 
B.  O'Connor  and  T.  S.  Huang 


I . Introduction 

This  report  will  present  the  details  on  the  theory  and  implementation  of 
two-dimensional  complex  cepstral  analysis  referred  to  In  the  last  report. 
Possible  uses  of  this  algorithm  include  tests  for  stability  of  two-dimensional 
digital  recursive  filters  and  enhancement  of  Images  blurred  by  special  point- 
spread  functions.  Cannon  [1]  has  used  the  cepstral  signature  of  a blurred 
picture  to  determine  the  extent  and  the  type  of  degradation.  However,  he 
allowed  only  three  kinds  of  blurs;  linear  motion,  defocus,  and  atmospheric. 
These  degradations  have  real  point-spread  functions  and  their  zero  crossings 
(phase  information)  characterize  them  completely.  In  fact,  since  his  method 
used  the  cepstrum,  as  opposed  to  the  complex  cepstrum,  zero  crossings  give  the 
only  phase  information  about  the  blur  and  hence  this  technique  cannot  be  em- 
ployed to  deblur  more  complicated  degradations.  The  algorithm  to  be  reported 
here  will  allow  the  calculation  of  the  complex  cepstrum  of  a picture.  The  ad- 
vantage of  the  complex  cepstrum  Is  that  it  uses  both  the  magnitude  and  phase 
of  the  input  picture  and  hence  allows  a complete  picture  description  in  the 
cepstral  domain.  Hopefully,  this  description  will  give  cepstral  signatures 
to  more  complicated  blurring  functions. 

The  complex  cepstrum  can  be  employed  to  test  the  stability  of  general 
two-dimensional  recursive  filters.  The  complex  cepstrum  of  a state  two-dimen- 
sional recursive  filter  is  non-zero  only  in  a certain  distinguished  regions 
of  the  cepstral  plane.  Ekstrom  and  Woods  have  obtained  similar  results  em- 
ploying the  cepstrum  (not  the  complex  cepstrum). 


i : 
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1 1 . 2-D  Complex  Cepstrum  Theory 

Let  x(m,n)  be  a H x N sequence  of  real  numbers.  The  complex  cepstrum  of 
x(m,n)  called  cx(m,n)  is  defined  as  the  inverse  z-transform  of  the  complex 
logarithm  of  the  z-transform  of  x(m,n).  Let  XCz^z^)  be  the  z-transform  of 

A A 

x(m,n)  and  X(zj,z2)  - log  X(z^,z2).  X(zj,z2)  must  be  a valid  z-transform  for 
cx(m,n)  to  exist.  In  order  for  cx(m,n)  to  be  uniquely  defined,  a region  of 

A 

convergence  must  be  chosen  for  XtzpZ^).  Assume  x(m,n)  and  cx(m,n)  are  real, 
stable  sequences  so  that  the  regions  of  convergence  of  both  X(zj»z2)  and 

A 

X(zj,z2)  include  the  unit  polydisc  and  hence,  their  Fourier  Transforms  exist. 


jw,  ju,  ju,  ju  ju  ju, 

X (e  ',  e 2)  - XR(e  \ e 2)  + j X^e  ',  e 2) 


_ . * jw,  ju,  * ju,  ju, 

X (e  ',  e *)  «=  XR(e  ',  e Z)  + j XJ(e  e Z) 


ju,  ju 


Since  cx(m,n)  is  real,  then  XR  must  be  an  even  function  of  (uj,^)  and  X^. 

A 

must  be  an  odd  function  of  (u^.Uj).  More  importantly  since  X is  analytic  then 
it  must  be  a continuous  function  of  (uj,^).  Hence,  since 


A - i*<A  Ai  • 


J\  A) 


implies  that 


so 


A JU,  JU  JU  JU  JU,  JU 

X(e  , e 2)  - log  |X(e  , e 2)|  + jarg[X(e  , e 2)] 


a ju,  ju  ju,  ju 

XR (e  , e Z)  - log  | X (e  , e Z) 


A ju  ju  ju  ju 

XJ(e  1 , e 2)  - arg[X(e  1 , e 2)j 
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must  be  continuous  functions  of  (u^ . However,  continuity  of  Xj  Is  depen- 
dent on  the  definition  of  the  complex  logarithm.  Here  a difficulty  arises 


since  the  complex  logarithm  is  not  a unique  transformation.  This  difficulty 

j“,  j<*>- 

cannot  be  resolved  by  using  ARG[X(e  , e ^)]  because  ARG  is  a discontinuous 


function. 

One  approach  for  obtaining  a continuous  phase  Is  to  integrate  its  deri- 
vative. The  resultant  phase  curve  is  called  the  unwrapped  phase.  Below,  the 
phase  derivative  is  calculated  in  terms  of  easily  obtained  quanties.  Starting 
with 


3 i 3 X (z  ■ »Z_ ) 

3^7  1o9[X(zPZ2)]  " X(z~2T Se- 


this Implies  that 


; : j<*>, 

3 * J“>,  jo)  ax/  J 1 2»  jo).  jw 

^7  *('  - 2>-8X(e  >/»(«  '.«  2> 


jw,  jw. 


3ci»1  M*  ® ^ +J<§uj  Xj(e  '*  ® ^ 


JW.  Ju>„ 


hence 


ju 


j“,  jw. 


da). 


arg[X(e  , e 2)]  - -gjjj-  X^e  \ e 2) 


X.  . Xr  - Xr  • XB 

R 3(Uj  I I 3u>j  R 


2 2 
X + X 


with  boundary  condition 


jw,  jw, 

arg[X(e  , e 2)] 


u>j  " 0 


- 0 


u>2  “ 0 


. <-  ^ ...  - 


i 


Note  that  In  some  cases  It  will  be  necessary  to  change  the  signs  of  all  x(m,n) 
before  using  this  boundary  condition.  The  expression  for  the  phase  derivative 
can  be  calculated  directly  from  the  sequence  x(m,n)  if  we  employ  the  following 
relation. 

8 V/.J“  1 ju)2>  3 „ , j"l  J"2W.  3 y,J“l  J“2, 

3^-  X(e  • e > ’ 3S7  XR  e - e > +J  3^  XJ(e  • e > 


-j  FT[m*x(m,n)] 


A similar  formula  can  be  derived  for  the  partial  derivative  of  the  phase  with 


respect  to  c^. 

An  efficient  phase  unwrapping  algorithm  has  been  proposed  recently  by 
J.  Trlbolet  [3].  The  algorithm  uses  an  adaptive  numerical  integration  scheme 
that  combines  the  information  contained  in  both  the  phase  derivative  and  the 
principal  value  of  the  phase.  Each  phase  estimate  is  formed  by  numerical  in- 
tegration of  the  phase  derivative  using  a given  step  interval.  This  step  in- 
terval is  adapted  until  the  principal  value  of  the  resultant  phase  estimate 
does  not  significantly  differ  from  the  known  principal  value  of  the  phase  at 
that  frequency.  This  method  has  been  extended  to  two  dimensions  in  our  pre- 
sent work.  The  basic  idea  of  the  algorithm  in  two  dimensions  is  to  calculate 
the  Integral  of  the  partial  derivatives  by  numerical  means  using  the  trape- 
zoidal Integration  rule.  Assuming  the  unwrapped  phase  of  (wgi,UJ02^  *s  known, 
the  estimate  of  the  unwrapped  phase  at  (qjjjrioqj)  is  given  by 


_ _ _ J'1*]  I J***/»5 

arg[X(e  H,  e °2)]  - arg[X(e  °\e  02)] 


+ !!iiZ2L  .{JL  arg[X(eJW°\  e^02) 
+ ■^IX(eJ“n.  eJ“02)]} 


Wl»." 
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i 


A similar  equation  is  true  in  the  direction.  Clearly,  this  estimate  im- 
proves as  the  step  Interval  becomes  smaller.  The  basic  idea  of  this  algorithm 
is  to  adapt  the  step  size  until  the  result  of  the  numerical  integration 
matches  the  information  given  by  the  principal  value  of  the  phase  [3]. 

We  shall  say  that  the  step  size  Aw  = Ujj-Uqj  leads  to  a consistent  phase 
estimate  of  (uj)*uq2^  ^ 

lE((worU)02)  * (“il*“02))l  < < 11 


Where  E measures  the  difference  between  the  principal  values  of  the  phase  and 
its  estimate,  that  is 


J“ll  J“02, 


E«»o,.“o2>  " <“l)»“o2,)  SARGIX(<!  . e » 


Jw| I Junj 

- ARG[X(e  11 , e 02)] 


Otherwise  the  step  size  must  be  reduced  in  order  to  obtain  a more  reliable 
estimate  of  the  phase.  As  soon  as  a reliable  estimate  is  found  the  unwrapped 
phase  is  defined  by 


jwn  >n9  .)<*>,,  jr-- 

arg[X(e  '',e  °2)]  - arg[X(e  n,  e 02)] 


E((u>01,u)02)  * (“ir“02) 


so  i t unwraps  to 


arg[X(eJU>n,  e^02)] 


without  error. 

If  is  comforting  to  know  that  the  existence  of  cepstra  for  2-D  rational 
polynomials  has  been  proved  by  Dudgeon  [4].  In  general,  if  x(m,n)  has  a 

Jr.  .ir- 
rational z-transform  then  the  phase  associated  with  X(e  , e ) has  a 


‘ 

\ 

1 


linear  component  plus  a continuous,  odd,  periodic  component.  From  this  infor- 
mation it  can  be  shown  that  any  2-D  array  having  a rational  z-transform  will 

ju.  jo>2 

have  a well  defined  2-D  complex  cepstrum  If  X(e  , e ) »*  0 for  all 
and  if  the  linear  phase  components  have  been  eliminated  by  an  appropriate  shift 
of  the  original  array.  On  this  issue  of  linear  phase,  there  is  a significant 
departure  from  one-dimensional  theory  in  that  the  linear  phase  component  can- 
not be  completely  removed  merely  by  shifting  x(n,m)  by  an  appropriate  amount 
[6]. 

III.  Summary 

The  above  two-dimensional  phase  unwrapping  algorithm  and  complex  cepstrum 
computer  programs  have  been  written  using  the  two-dimensional  FFT  to  approxi- 
mate the  Fourier  Transforms.  These  programs  have  been  applied  successfully  in 
determining  stability  of  two-dimensional  recursive  filters.  Future  work  will 
include  extending  this  program  so  that  the  cepstrum  of  a picture  can  be  cal- 
culated on  a PDP-1 1/1*5. 
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IMAGE  RESTORATION:  COMPARISON  OF  THE  PROJECTION 

METHOD  WITH  SINGULAR  VALUE  DECOMPOSITION  (SVD) 

S.  P.  Berger  and  T.  S.  Huang 


I.  Introduction 

This  report  contains  a comparison  of  two  techniques  of  Image  restoration. 
The  projection  method  and  the  singular  value  decomposition  (SVD)  approach  were 
applied  to  a simple  two-dimensional  image  which  was  blurred  by  a linear  degra- 
dation. White  Gaussian  noise  was  added  to  the  degraded  image.  The  two  methods 
were  compared  for  varying  amounts  of  noise. 

Before  presenting  the  computer  results,  it  will  be  helpful  to  give  the 
basic  ideas  of  the  two  methods. 

II.  The  Projection  Algorithm 

The  projection  method  requires  that  the  degradation  process  be  represented 
in  discrete  form.  A two-dimensional  Image  is  represented  as  a one-dimensional 
matrix  by  stacking  the  rows  of  pixel  values  into  a column  vector.  The  degrada- 
tion is  then  described  by 
9, 


’1 


all  f1  + a12  f2  + *•*  + aln  fn  + nl 


92  " a21  fl  + a22  f2  + *•*  + a2n  fn  + n2 


9 ■ a , f,  + a , f,  + ...  + a f + n 
n nl  1 n2  2 nn  n n 

where  f_  is  an  nxl  matrix  (column  vector)  representing  the  original  image,  £ Is 
the  degraded  image,  n is  a noise  vector,  and  A is  the  degrading  matrix.  For 
the  sake  of  simplicity,  we  assume  that  the  original  image  is  a square  array  of 
NxN  elements.  So  the  vector  £ contains  N*N  « n elements.  The  matrix  equation 


I 


I 


I 

t ■ 


: ; 

* 


I 

fci 


.. 


no 


Let  us  neglect  the  noise  vector  for  the  moment.  The  restoration  problem 
is  to  obtain  f.  from  £,  with  /^unknown.  The  projection  method  is  an  iterative 
procedure  for  solving  for  the  original  image  f. 

Each  equation  represents  a hyperplane  in  an  n-dimensionai  space.  The  de- 
graded image  £ is  a point  in  this  space.  The  projection  method  starts  at  some 
initial  guess  (usually  £) , and  finds  the  projection  onto  the  first  hyperplane. 
This  point  is  then  projected  onto  the  second  hyperplane,  and  so  on,  until  the 
last  projection  is  onto  the  nth  hyperplane.  This  then  completes  one  major 
iteration.  A projection  is  made  onto  the  first  hyperplane  to  start  the  next 
major  iteration. 

If  a unique  solution  exists,  the  algorithm  will  yield  vectors  which  con- 
verge to  it.  If  no  such  solution  exists,  the  method  still  provides  useful 
results.  The  effect  of  additive  noise  tends  to  increase  with  more  iterations. 
The  decision  must  then  be  made  as  to  how  many  iterations  to  perform.  This  is 
a subjective  determination  of  which  iteration  has  yielded  the  best  restored 
image. 

III.  Singular  Value  Decomposition 

The  singular  value  decomposition  (SVD)  approach  is  based  on  a representa- 
tion of  the  pseudo- inverse  of  a matrix.  The  original  image  f_  can  be  estimaged 


A + + + 
by  f_ “A  £,  where  A is  the  Moore-Penrose  pseudo- Inverse.  Now  A can  be 

+ R * T 

obtained  by  A «■  !!  ^ ^1/2  V^U.  , where  R is  the  rank  of  A_,  and  V.  are 


i=l  {Xi? 


i 


-i: 

'T  T 

the  eigenvectors  of  A A and  A respectively,  and  the  X..  are  the  eignevalues 


of  either  (called  the  singular  values  of  A).  The  A.  are  in  the  order  of  de- 


creasing magnitudes. 


The  restoration  Is  then  represented  as  f * l 

i-1 


(X,) 


1/2 


I,  Uj  * The 


user  must  decide  on  the  optimal  value  of  P,  since  the  effect  of  noise  will 
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dominate  the  summation  after  a certain  number  of  terms.  This  effect  is  demon- 
strated in  the  restoration  equation  f « A+  £ = — I 2.  + “ A*  A.  £ * £• 

The  first  term,  £ , is  the  minimum-norm  least-square  estimate  without 

noise,  and  the  second  is  the  noise  term.  The  larger  the  value  of  P,  the  closer 

the  first  erms  is  to  the  original  image.  The  noise  term,  however,  increases 
1 


as 
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with  P. 
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IV  Computer  Results 

Both  of  these  methods  were  applied  to  a problem  given  in  Huang  and 
Narendra  [1],  An  8x8-plxel  image  of  the  number  "5"  was  used  In  the  test 
(Fig.  1).  The  blank  area  has  a value  0,  and  the  dark  region  has  the  value  7. 
This  Image  was  degraded  by  replacing  each  point  with  the  average  of  the  3x3 
array  containing  the  point  in  the  center.  The  degraded  image  is  given  in  Fig.  2 

This  degradation  results  In  a sparse  matrix,  A,  (64x64)  containing  only 
the  values  0 and  1/9.  On  the  border,  this  choice  of  a_  reflects  the  assumption 
that  the  edge  on  the  original  image  which  lay  outside  the  8x8  array  had  the 
value  zero.  The  treatment  of  the  degradation  at  the  borders  of  the  image  does 
not  affect  the  values  of  ^considerably.  However,  the  operation  of  the  SVD 
method  Is  greatly  affected  for  some  unknown  reason.  Evidently,  some  difficulty 
arises  in  the  calculation  of  the  singular  values.  The  matrix  £ which  was  given 
earlier,  though,  causes  no  problems  in  the  operation  of  the  SVD  method. 

The  two  methods  were  compared  for  different  amounts  of  zero-mean  additive 
Gaussian  noise.  The  degraded  image  plus  noise  that  has  a standard  deviation 
of  o • 0.1  Is  shown  in  Fig.  3.  The  Image  restored  by  the  SVD  after  48  terms 
Is  given  In  Fig.  4a.  The  projection  algorithm  result  after  15  iterations  is 
shown  In  Fig.  4b.  This  restoration  is  about  equivalent  to  the  SVD.  The  pro- 
jection algorithm  can  Include  a priori  information.  If  the  Image  vector  f_ 


Is  restricted  to  positive  values  after  each  Iteration,  the  restoration  Improves 
markedly  (Fig.  4c, d).  From  Fig.  4a,b,c,d,  it  Is  evident  that  the  projection 
algorithm  with  the  positivity  constraint  yieids  a better  restoration  for  this 
level  of  noise. 

In  Fig.  5 a,b  the  noise  is  Increased  to  a = 0.5.  The  best  results  from 
each  of  the  two  methods  is  presented.  The  choice  between  the  two  is  difficult. 
The  projection  method  image  is  "cleaner",  but  the  SVD  resuit  is  less  "checkered" 
in  appearance.  For  the  latter  reason,  the  SVD  image  is  probably  preferable. 

The  noise  level  in  Fig.  6a, b is  a = 1.0.  The  two  results  are  similar. 

But  the  SVD  result  is  slightly  better  for  the  same  reasons  as  stated  for 
Fig.  5.  In  Fig.  7a, b,  both  methods  are  near  the  limit  of  performance  at  the 
noise  level  of  a - 1.5.  The  SVD  image  is  more  cluttered,  but  perhaps  is  more 
recognizable  as  a "5". 

The  "error"  between  the  original  and  restored  images  was  recorded  along 

n * 2 

with  the  visual  results.  The  error  was  calculated  as:  Error  = l (f  -f.)  , 

i = l ' ' 

where  n * 64  for  this  case.  It  is  doubtful  whether  this  error  is  a useful 
quantity  in  the  comparison  of  restored  images.  For  example,  in  one  instance  of 
the  projection  method  an  increase  of  13%  occurred  between  the  25th  and  70th 
iterations.  However,  the  restored  image  improved  siightly  at  iteration  70. 

So  the  image  with  the  least  error  (as  calculated  above)  is  not  necessarily 
the  best. 

The  computer  time  required  by  the  two  methods  was  also  observed.  For  this 
problem  the  projection  algorithm  required  0.16  seconds  for  preliminary  calcu- 
lations and  0.173  seconds  for  each  iteration.  The  time  required  to  generate 
the  pictures  was  not  included.  The  SVD  took  about  9.0  seconds  to  read  the 
eigenvectors  and  singular  values  from  magnetic  tape.  Only  0.08  seconds  were 
needed  to  generate  the  48th  term.  However,  about  60  seconds  were  needed  to 
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generate  the  48th  term.  However,  about  60  secs,  were  needed  to  generate  the 
eigenvectors  and  singular  values  and  store  them  on  magnetic  tape.  Once  the 
eigenvectors  are  generated,  they  can  be  used  In  the  SVO  restoration  of  any 
Image  which  was  blurred  by  the  same  degradation.  The  major  time  requirement 
for  the  SVD  Is  the  retrelval  of  the  stored  Information. 

The  projection  algorithm  Is  seen  to  yield  better  results  than  the  SVD  for 
low  noise  levels.  The  SVD  gives  somewhat  better  restorations  than  the  pro- 
jection method  as  the  additive  noise  is  increased.  Once  the  eigenvectors  are 
stored,  the  SVD  computer  time  is  comparable  to  that  required  by  the  projection 
method.  It  Is  difficult  to  evaluate  the  computer  efficiency  of  the  SVD  for 
this  8x8  pixel  problem.  As  the  size  increases,  the  SVD  implementation  becomes 
much  more  difficult.  The  difficulty  In  the  projection  method,  on  the  other 
hand,  depends  more  on  the  extent  of  the  degrading  system  impulse  response. 

V.  Conclusions 

The  comparison  between  the  singular  value  decomposition  and  the  projection 
algorithm  has  yielded  some  mixed  results.  The  projection  algorithm  has  proven 
to  be  an  effective  method  of  restoration.  For  the  particular  example  presented 
in  this  report,  the  projection  method  is  definitely  better  than  the  SVD  for  low 
noise  levels.  However,  the  higher  levels  of  additive  noise  tend  to  reverse 
this,  and  the  SVD  gives  restored  images  whose  appearance  is  subjectively  better. 

Even  for  this  small  example,  the  computer  time  required  to  generate  the 
eigenvectors  and  singular  values  for  the  SVD  is  large.  One  of  the  advantages 
of  the  projection  algorithm  is  that  it  can  handle  large  images  much  more 
readily  than  the  SVD.  The  projection  method  is  also  more  versatile  in  that  it 
can  handle  a priori  information  about  the  original  image.  Both  methods  rely 
on  a subjective  determination  of  the  number  of  terms  or  iterations  required  to 


give  the  best  restoration. 
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The  next  report  wtl 1 contain  a one-dlmenslonal  problem  for  further  com- 
parison between  these  two  methods. 
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FOURIER  DESCRIPTORS  AND  THEIR  APPLICATION 
TO  AIRPLANE  SHAPE  ANALYSIS 

T.  Wallace  and  P.  A.  Wlntz 

In  our  previous  two  progress  reports,  we  have  discussed  the  practical 
Implementation  of  classification  of  shapes  using  Fourier  descriptors.  The 
basic  operations  of  scaling,  rotation,  and  changing  the  starting  point  of  a 
sampled  contour  have  been  discussed  from  the  frequency  domain  viewpoint,  fol- 
fowlng  the  theoretical  development  of  Granlund  [1],  The  major  difficulty  en- 
countered was  normalizing  these  factors  in  a way  which  preserved  all  of  the 
shape  information  in  the  space  domain  representation.  It  was  shown  that  the 
magnitude  information  is  more  important  than  the  phase  information  for  bi- 
laterally symmetric  contours  such  as  the  airplane  outlines  we  have  been  working 
with  recently.  While  simple  magnitude  normalization  avoids  the  problems  assoc- 
iated with  defining  a unique  orientation  and  starting  point,  some  Information 
is  necessarily  lost. 

Some  normalization  schemes  which  have  been  proposed  work  in  the  absence 
of  noise,  but  give  poor  results  when  applied  to  real  data.  The  simplest  normal- 
ization scheme  is  to  constrain  frequency  coefficients  A ( 1 ) and  A (2 ) to  some 
reference  phase,  performing  the  two  normalization  operations  to  achieve  this 
result.  For  some  data  sets,  this  method  is  less  successful  than  that  using 
the  magnitude  information  alone,  due  to  noise  which  perturbs  the  orientation 
and  starting  point  resulting  from  normalization.  We  have  now  developed  a 
method  of  FD  normalization  which  preserves  all  of  the  shape  information  while 
rejecting  noise  successfully.  In  order  to  reject  noise,  the  coefficients  used 
in  the  procedure  are  chosen  to  have  as  large  magnitudes  as  possible. 

First,  we  require  the  phases  of  the  two  largest  coefficients  to  be  zero. 

A ( 1 ) will  always  be  the  largest,  with  magnitude  unity  due  to  the  scale 
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normalization  procedure  which  defines  that  magnitude.  Let  the  second  largest 
coefficient  be  A(k).  (The  frequencies  of  the  coefficients  produced  by  an  FFT 
of  length  n range  from  -(n/2)  + 1 to  (n/2 ) ) . The  normalization  multiplicity 
m of  coefficient  A(k)  is  defined  as: 
m = ] m— 1 ] 

Theorem:  The  requirement  that  A ( 1 ) and  A(k)  have  zero  phase  angle  can  be  sat- 

isfied by  m different  orientation/starting  point  combinations. 

PROOF:  Use  the  two  allowable  operations  to  arrive  at  one  orientation  and  start- 

ing point  which  gives  zero  phase  for  A(l)  and  A(k).  Next  use  the  starting  point 
movement  operation  (multiplication  of  the  ith  coefficient  by  exp  ( I • J*T))  to 
move  the  starting  point  once  around  the  entire  contour.  To  accomplish  this  I 
must  range  from  0 to  2ir.  Now  consider  the  two  cases  k positive  and  k negative. 
If  k is  positive,  the  phase  of  A(l)  and  A(k)  will  coincide  at  k-I  different 
starting  points.  But  at  each  of  these  starting  points,  we  can  use  the  orien- 
tation operation  (multiplication  of  each  coefficient  by  exp  ( j • j0))  to  reduce 
the  phases  to  zero.  Similarly,  if  k is  negative,  the  phases  of  A(l)  and  A(k) 
will  coincide  at  1-k  different  starting  points.  Again,  the  orientation  oper- 
ation can  reduce  the  phases  to  zero. 

Note  that  if  k=2,  the  orientation  and  starting  point  are  defined  uniquely. 
In  general,  however,  A(2)  will  not  be  the  second  largest  coefficient  in  magni- 
tude so  this  ambiguity  must  be  resolved  to  achieve  a general  procedure. 

The  obvious  method  of  solving  this  problem  is  to  check  the  phase  of  a 
third  coefficient  A(p)  at  each  of  the  m possible  orientation/starting  point 
combinations  and  choose  the  normalization  which  gives  a phase  closest  to  zero 
for  this  coefficient.  However,  this  ambi gul ty-resol ving  coefficient  cannot 
be  chosen  arbitrarily.  If  the  normalization  multiplicity  of  coefficient  A (p) 

Is  the  same  as  that  of  A(k),  or  a multiple  of  it,  the  phase  of  A(p)  will  be 


the  same  at  each  possible  normalization!  If  m for  coefficient  A(p)  (denoted 
m[p])  is  a factor  of  m[k],  or  a multiple  of  a factor  of  m[k]  less  than  m[k], 
there  Is  also  ambiguity  since  some  of  the  m possible  normalizations  will 
result  in  Identical  phases  for  A(p).  If  these  ambiguous  coefficients  are 
removed  from  consideration,  and  the  unambiguous  coefficient  with  the  largest 
magnitude  is  used  to  select  one  of  the  m allowable  normal fzations,  a general 
procedure  Is  obtained. 

To  briefly  review  the  entire  normalization  procedure,  we  start  by  divid- 
ing each  coefficient  by  the  magnitude  of  A ( 1 ) to  normalize  the  size  of  the 
contour.  We  find  the  coefficient  of  second  largest  magnitude  and  compute  its 
normalization  multiplicity.  We  then  locate  the  third  largest  coefficient  suit- 
able for  resolving  the  ambiguity  (A (p ) ) as  explained  above.  The  orientation 
and  starting  point  are  adjusted  to  satisfy  the  restrictions  that  A ( I ) and  A (k ) 
are  real  and  positive,  and  A(p)  has  phase  as  close  to  zero  as  possible. 

This  method  is  quite  powerful,  but  a slight  modification  in  the  procedure 
has  been  found  helpful  in  those  cases  in  which  there  are  two  or  more  coeffi- 
cients suitable  for  use  as  A(p)  with  almost  the  same  magnitude.  It  is  very 
unlikely  that  the  magnitudes  will  be  identical,  but  if  they  are  even  close, 
noise  may  cause  one  of  them  to  be  used  to  normalize  the  test  FD,  and  the  other 
to  normalize  the  unknown  FD.  To  overcome  this,  the  ambiguity-resolving  co- 
efficient used  to  normalize  the  test  FD  can  be  supplied  to  the  normalization 
subroutine  directly,  rather  than  having  the  subroutine  compute  it. 

To  investigate  the  classification  accuracy  using  this  method  as  opposed 
to  just  using  the  magnitudes,  the  experiment  described  in  our  last  progress 
report  was  performed  both  ways.  Briefly,  the  experiment  Involved  classifying 
20  aircraft  contours  quantized  to  a 6l*x6k  grid  using  a test  set  consisting  of 
the  same  aircraft  contours  but  quantized  to  a resolution  of  128x128.  Using 
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this  method,  the  classification  accuracy  was  1 00%  for  an  absolute  value  distance 

! 

measure,  and  95%  for  a Euclidean  distance  measure,  as  reported  previously. 

Using  only  the  magnitude  information,  classification  accuracy  dropped  to  95% 


and  90%  respectively. 

The  next  step  in  this  research  involves  using  the  BLOB  algorithm  to  locate 
the  outlines  of  objects  in  actual  photographic  data,  then  classifying  the 
shapes  using  the  Fourier  Descriptor  algorithm  described  above.  The  version  of 
BLOB  that  we  are  presently  working  with  differs  from  earlier  versions  in  that 
it  does  not  preferentially  find  outlines  in  any  given  direction.  There  is 
hence  very  little  dependency  on  the  direction  in  which  the  picture  was  scanned, 
and  no  consistent  false  contours  in  any  single  direction. 

The  major  problem  in  adapting  BLOB  to  this  application  involves  the 
statistical  assumptions  used  to  classify  pixel  groups  as  similar  or  dissimilar. 
The  original  BLOB  was  developed  to  classify  mul tispectral  data,  which  was 
assumed  normally  distributed.  This  assumption  is  generally  ineffective  in 
dealing  with  ordinary  photographic  data,  and  preliminary  results  reflect  this 
fact.  More  appropriate  statistical  assumptions  are  under  consideration  in 
order  to  improve  contour  location  accuracy. 
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LOCATING  AIRPORTS  IN  LANDSAT  IMAGERY 
X.K.  Dang  and  T.S.  Huang 

As  mentioned  in  our  report  (Nov.  75  - Jan.  76)  our  concern  is  to  separate 
airports  from  highways.  We  specify  our  visual  model  of  a runway  as  follows: 

l : 

1.  Spectral  properties  of  runways  and  highways  are  very  similar  corres- 
ponding to  concrete  materials. 

r 

2.  The  shape  of  a runway  is  different  from  a highway.  It  is  a strip  with 
the  following  properties: 

a)  Large  width 

b)  Maximal  length 

c)  No  curvature 


i 
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FUR  IMAGERY  TACTICAL  TARGET  DETECTION  AND  CLASSIFICATION 

0.  R.  Mitchell 

This  project  is  being  carried  out  in  cooperation  with  Honeywell's  System 
and  Research  Division,  Minneapolis,  Minnesota.  Honeywell  has  developed  a real- 
time target  cueing  system,  the  Autoscreener,  which  is  geared  to  the  detection 
of  tactical  targets  in  a rural  environment.  Purdue  has  agreed  to  assist  in 
developing  improved  techniques  for  this  device  to  allow  target  detection  in  an 
urban  environment  and  target  classification  in  both  rural  and  urban  environ- 
ments. 


The  present  direction  of  our  work  is  to  develop  algorithms  for  target 
classification  in  a rural  environment.  The  lack  of  urban  FUR  data  containing 
tactical  targets  has  led  us  to  concentrate  initially  on  the  rural  case.  Our 
overall  goal  is  to  automatically  rapidly  classify  objects  into  vehicles  and 
non-vehicles  and  to  further  classify  each  category  into  several  tactical 
classes  (e.g.  tank,  APC,  jeep,  truck). 

We  plan  to  use  statistical,  textural,  and  shape  information  in  segmenting 
a designated  target  into  subparts.  These  subparts  might  include:  motor  hot 
spot,  wheels  or  tracks,  body,  windshield,  turret,  etc.  Classified  subparts 
would  then  be  checked  against  a grammatical  description  of  desired  targets. 

A more  global  clue  which  might  be  helpful  in  a rural  classification  is  a 
knowledge  of  the  terrain  type  as  detected  by  a texture  measure.  For  example, 
the  shape  constraints  for  a target  in  a heavily  wooded  area  should  be  relaxed 
somewhat  since  some  parts  of  the  object  may  be  obscured  by  trees. 

The  inclusion  of  urban  environment  targets  will  depend  heavily  upon  image 
segmentation  and  background  classification.  The  shape  constraints  or  target 


* objects  will  allow  partial  target  obstruction  by  other  man-made  objects 


Moving  objects  may  be  detected  by  using  multiframe  change  detection. 

These  proposed  techniques  require  use  of  existing  segmentation,  feature 
extraction,  and  syntactic  algorithms;  the  development  of  new  methods  due  to 
the  unique  character  of  FUR  data;  and  the  implementation  of  this  software 
efficiently  in  a real-time  system. 

A data  set  has  been  selected  for  initial  work.  This  consists  of  25  digi- 
tized A80*5i2  images,  each  containing  one,  two,  or  no  targets.  Sample  tar- 
gets are  shown  in  Fig.  1.  Some  presently  available  algorithms  have  been  ap- 
plied to  these  images.  For  example,  the  texture  edge  algorithm  (see  report  in 
prior  section)  produced  the  output  shown  in  Figs.  2-A  and  the  statistical  con- 
tour following  algorithm  produced  the  contour  shown  in  Fig.  5. 


f igure  2 Texture  edges  in  Fig.  I (a)  found  by  comparing 
local  extrema  surrounding  each  part. 


Figure  3 Local  naxina  in  Fig.  2 superimposed 
fig.  Ha). 


on 


Figure  * Cach  horizontal  Interval  shown  nas  the 
correct  texture  and  intensity  to  be  a 
target  (original  data  In  Fig.  |(b). 
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QTY  Manufacturer 

3 Beehive  Elect. 

2 Tex.  Inst. 

1 Digi-Data 

I DEC 

1 DEC 

I Fabrltek 

I Versatek 

I Comtal 

I Data  Printer 

I True-Data 

I Tektronix 

1 DEC 


FACILITIES 

Description 

"Super-Bee"  Terminals 

"Silent  700"  Terminals 

Industry  standard  magnetic  tape  system; 

2,  9-track  and  I,  7-track  drives;  one  each 
NRZI  and  phase-encoded  formatters/controllers 

Dual -drive  DEC tape  unit 

RP03  disk  drive  (40  million  characters) 

96K-word  auxiliary  memory  system 
(64K  bought  by  ARPA,  32K  by  NASA) 

Electrostatic  matrix  printer 

Color  picture  display 

132  column,  600  L.P.M.  line  printer 

Punched  card  reader 

Model  4010,  g.-aphics  display 

PDP-11/45  computer  system;  system  includes: 
32K  memory 

FPP-U  floating  point  processor  (NSF  money) 

H960  extension  mounting  cabinet 

3 - small  peripheral  mountings  blocks  (DD-U) 

1 UNI  BUS  repeater/expander 

DH11,  16-1 Ine  terminal  multiplexor 

KW1 1 -p  programmed  clock 

"ANTS"  - type  PDP-11/ IMP  interface 


Note:  Our  PDP-11/45  is  currently  operating  under  the  UNIX  system. 
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